About me

Hi, I'm Dwi Budi Setyonugroho. My journey began with an Engineering degree, where my thesis on risk assessment taught me early on how to approach data analysis. As my interest in the data science field grew, I pursued the IBM Data Analyst Professional Certificate to master the fundamentals of analytics, followed by the Microsoft Power BI Data Analyst Professional Certificate to deepen my expertise in data visualization. Finally, I completed the Google Advanced Data Analytics Professional Certificate, where I dove deeper into statistics and machine learning to build predictive models.

Subsequently, I joined virtual internships with three companies, applying those skills to solve complex challenges in credit risk modeling and revenue optimization. Currently, I work as a private driver, which has further honed my soft skills in communication, crisis management, time management, and adaptability. I would be honored to bring my blend of knowledge to your team.

Technical Skill

Visual Studio Code (VS Code)

I utilize VS Code as my central integrated development environment for writing, debugging, and version-controlling code. I leverage extensions for Git integration, Markdown previewing, and Python linting to ensure code quality, while managing virtual environments (venv) and package installations directly via the integrated terminal.

Jupyter Notebook

I create reproducible research narratives that combine executable code, rich text explanations (Markdown), and visualizations into a single linear story. I use notebooks for rapid prototyping of Exploratory Data Analysis (EDA) and hypothesis testing, ensuring every step from data loading to model serialization is self-documented for non-technical review.

Google BigQuery

I architect serverless cloud data pipelines on Google BigQuery, ingesting disparate datasets and consolidating them into a unified data table. I optimize performance through partitioning strategies and cost-effective query design, managing both Sandbox environments for prototyping and production-ready datasets.

SQL

I engineer complex transformation logic using SQL, leveraging persistent table creation to build unified "Single Source of Truth" datasets. My proficiency covers tiered conditional logic (CASE WHEN) for dynamic margin calculations, Common Table Expressions (CTEs) for maintainable code, and advanced multi-table joins to seamlessly merge transactional and dimensional data with high accuracy.

Power Query

I execute efficient in-tool ETL processes, performing advanced data shaping such as dynamic column type conversion, date extraction, and custom column creation without altering source data.

Python

I architect robust end-to-end data pipelines using Pandas to transform raw, messy datasets into analysis-ready formats through complex cleaning, strategic imputation, and leakage prevention. I leverage Matplotlib and Seaborn in tandem to rapidly prototype high-fidelity visualizations that uncover hidden risk patterns and communicate statistical distributions with clarity.

Machine Learning & Modeling

I implement strategic machine learning solutions using Scikit-Learn, specializing in diagnosing and resolving severe class imbalance. I perform rigorous feature engineering (One-Hot Encoding, Standard Scaling), conduct model selection between linear and ensemble approaches based on business KPIs, and ensure model integrity by auditing for data leakage.
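As a rough illustration of the two feature-engineering steps named above, the sketch below reimplements one-hot encoding and standard scaling in plain Python. It is not the project code: in practice Scikit-Learn's OneHotEncoder and StandardScaler would do this work, and the column names and values here are hypothetical.

```python
from statistics import fmean, pstdev


def one_hot(values):
    """Map each category to a 0/1 indicator column, in sorted category order."""
    categories = sorted(set(values))
    return categories, [[int(v == c) for c in categories] for v in values]


def standard_scale(values):
    """Center to mean 0 and scale to unit (population) standard deviation."""
    mean, std = fmean(values), pstdev(values)
    return [(v - mean) / std for v in values]


# Hypothetical toy columns, not taken from any project dataset.
home_ownership = ["RENT", "OWN", "RENT", "MORTGAGE"]
annual_income = [40.0, 60.0, 80.0, 100.0]

categories, encoded = one_hot(home_ownership)
scaled = standard_scale(annual_income)

print(categories)  # ['MORTGAGE', 'OWN', 'RENT']
print(encoded[0])  # [0, 0, 1]
```

The scaler divides by the population standard deviation, which matches StandardScaler's default behaviour; a constant column would need special handling because its standard deviation is zero.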

Microsoft Power BI Desktop

I design executive-level, multi-page interactive dashboards that translate complex statistical findings into actionable business insights. My capabilities include developing high-impact KPI cards, building geospatial filled maps, and implementing filters for granular portfolio analysis.

Google Looker Studio

I build end-to-end business intelligence solutions connecting directly to BigQuery, featuring dynamic filter controls and real-time KPI scorecards with trend lines. My dashboards leverage advanced visualizations—including geo-spatial heatmaps, scatter plots, bar charts, and donut charts for composition—to transform complex data into actionable strategic narratives.

PowerPoint

I construct strategic executive decks that flow logically from Business Context to Methodology, Technical Implementation, and Strategic Recommendations. I integrate high-resolution dashboard screenshots with callout boxes and annotations to guide audience focus, utilizing corporate branding and clean layouts to deliver board-ready presentations.

Technical Documentation

I author comprehensive technical artifacts, including detailed Data Dictionaries that define schema granularity and explicit business logic formulas. I craft executive-level README files that serve as professional project landing pages, streamlining business context summarization, tech stacks, key insights, and deployment links to ensure seamless stakeholder handover and long-term project reproducibility.

Git & GitHub

I manage professional repository structures with logical hierarchies and comprehensive README.md files. I maintain clean commit histories with meaningful messages and track changes carefully to support reproducible deployment and portfolio presentation.


Certification

DataCamp Certification

SQL Associate Certification

Validated proficiency in essential SQL skills for data analysis, including data manipulation, joining tables, and summarizing datasets. This certification involves a rigorous timed exam and practical application of SQL queries to real-world business scenarios.

Technical Skill

Data Manipulation, Joining Tables, Summarizing Datasets, SQL Queries
Verify on DataCamp
ID:SQA0014758676777

Education

DataCamp Education

SQL Fundamentals

DataCamp

In this learning journey, I explored intermediate and advanced SQL concepts, primarily using PostgreSQL. I learned how to logically design databases using normalization and dimensional modeling. I mastered complex querying techniques, including advanced joins, set operations, and various subqueries like CTEs. I also gained skills in data manipulation using CASE statements, date/string functions, and arrays. Finally, I learned to write powerful window functions for analytics and manage database views and roles.

Technical Skill

PostgreSQL, Database Design, Normalization, Dimensional Modeling, Advanced SQL Joins, Subqueries (CTEs), Window Functions, Data Manipulation, Database Views & Roles, Relational Databases, SQL Optimization
Coursera Education

Google Advanced Data Analytics Professional Certificate

Google

This program included over 200 hours of instruction and hundreds of practice-based assessments, which helped me simulate real-world advanced data analytics scenarios that are critical for success in the workplace. The content was highly interactive and exclusively developed by Google employees with decades of experience in advanced data analytics and data science. Through a mix of videos, assessments, and hands-on labs, I was introduced to advanced data analytics tools and platforms and key technical skills required for an advanced role.

Technical Skill

Data Science, Machine Learning, Feature Engineering, A/B Testing, Statistics, Statistical Analysis, Regression Analysis, Sampling (Statistics), Statistical Hypothesis Testing, Data Analysis, Exploratory Data Analysis, Data Ethics, Data Visualization, Data Visualization Software, Tableau Software, Data Storytelling, Data Presentation, Python Programming
Coursera Education

Microsoft Power BI Data Analyst Professional Certificate

Microsoft

This program was uniquely mapped to key job skills required in a Power BI data analyst role. In each course, I consolidated what I learned by completing a project that simulated a real-world data analysis scenario using Power BI. I also completed a final capstone project where I showcased all my new Power BI data analytical skills by connecting to data sources to transform data into an optimized data model and demonstrating data storytelling through dashboards, reports and charts to solve business challenges and identify new opportunities.

Technical Skill

Power BI, Business Intelligence, Business Analytics, Advanced Analytics, Statistical Reporting, SQL, Database Design, Data Warehousing, Data Storage, Data Collection, Data Integrity, Data Manipulation, Data Processing, Data Quality, Data Visualization, Report Writing, Timelines, Microsoft Excel
Coursera Education

IBM Data Analyst Professional Certificate

IBM

I gained hands-on expertise in the full data analysis lifecycle, from wrangling datasets with SQL and Python to visualizing insights in Excel and Cognos. My projects included analyzing vehicle inventory with pivot tables, creating interactive KPI dashboards, and extracting financial data with Pandas. I also built regression models to predict housing prices and developed a dynamic Python dashboard for flight reliability, effectively bridging technical rigor with data storytelling.

Technical Skill

Python Programming, SQL, Plotly, Web Scraping, Generative AI, Data Analysis, Exploratory Data Analysis, Data Wrangling, Data Manipulation, Data Import/Export, Data Visualization, Data Visualization Software, Interactive Data Visualization, Data Presentation, Data Storytelling, Dashboard, Microsoft Excel, Excel Formulas, IBM Cognos Analytics
MySkill Education

Microsoft Excel Professional Skill Certificate

MySkill

Completed the 59-hour Microsoft Excel Full Learning Path with a Professional Skill certificate. This comprehensive program covers Excel from basic to advanced levels, including data manipulation, forecasting, regression, and statistical analysis. I gained hands-on experience with data cleansing, visualization, pivot tables, Power Pivot, VBA macros, and advanced analytical techniques.

Technical Skill

Data Cleansing, Sorting & Filtering, Data Formatting, Aggregate Functions, Conditional IF, VLOOKUP & HLOOKUP, INDEX MATCH, Pivot Tables, Data Visualization, Math Functions, Date & Time Manipulation, Logical Functions, Dynamic Arrays, Cell Referencing, Power Pivot, What-If Analysis, Macro VBA, Automation, Data Manipulation, Time Series Analysis, Linear Regression, Forecasting, Descriptive Statistics
View Certificate
ID:MS-3/9/2025-FWQ2FpFLDH393DhfzzzY
Bachelor’s Education

Bachelor of Engineering in Geological Engineering

Institut Sains & Teknologi AKPRIND Yogyakarta

GPA: 3.34 / 4.00

My engineering training provided a rigorous quantitative foundation, emphasizing spatial logic, complex system modeling, and statistical risk assessment. This background honed my ability to decompose ambiguous real-world problems into structured data models, directly translating geological field methodologies into modern data-driven decision-making frameworks.

Organizational Leadership

Human Resource Development Dept: Designed seminars, workshops, and research activities to develop members' soft and hard skills.

Media & Publication Dept: Managed and created graphic design and visual communication strategies for organizational social media.

Relevant Coursework

Geostatistics, Geocomputation & GIS, Research Methodology, Mathematics I & II, Physics, Quality Management

Project Portfolio

Data Scientist Virtual Internship

End-to-End Credit Risk Modeling and Dashboarding for ID/X Partners

Addressing a critical class imbalance in a 466k-row lending dataset, I engineered a leakage-free pipeline and deployed a Balanced Logistic Regression model to proactively identify high-risk borrowers. This end-to-end solution transformed raw data into actionable business intelligence, directly enabling stakeholders to mitigate potential financial losses through data-driven approval strategies.

Technical Highlights
Technologies
  • Python (Pandas, NumPy, Scikit-Learn, Matplotlib, Seaborn)
  • Power BI (DAX, Power Query)
  • VS Code
  • Git/GitHub
Methodologies
  • End-to-End CRISP-DM Lifecycle
  • Exploratory Data Analysis (EDA)
  • Data Leakage Prevention
  • Feature Engineering (One-Hot Encoding, Scaling)
  • Class Imbalance Handling (class_weight='balanced')
  • Model Interpretation (Coefficient Analysis)
Deliverables
  • Production-ready Jupyter Notebook (.ipynb) & Python Script (.py)
  • Serialized Model Artifacts (.pkl)
  • Interactive Power BI Dashboard (.pbix)
  • Executive Infographic Presentation (.pdf/.pptx)
  • Comprehensive Technical Documentation (README.md)
Impact & Insights
  1. Risk Detection: Increased Recall from 8% to 66%, successfully catching 2 out of 3 potential defaults that baseline models missed.
  2. Model Performance: Achieved a 220% improvement in F1-Score (0.14 → 0.45) by optimizing for the minority class rather than overall accuracy.
  3. Data Integrity: Reduced the dataset from 466k to 239k high-confidence records by rigorously removing data leakage and ambiguous "Current" loan statuses.
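For readers unfamiliar with the class_weight='balanced' setting listed under Methodologies, the sketch below shows the weighting rule it applies (n_samples / (n_classes * class_count)) and reproduces the relative F1 change from the two scores quoted above. The label list is a hypothetical toy example, not the lending dataset, and the helper names are illustrative only.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


def relative_improvement(before, after):
    """Fractional change from `before` to `after`."""
    return (after - before) / before


# Hypothetical labels: 90 repaid loans (0) and 10 defaults (1).
toy_labels = [0] * 90 + [1] * 10
weights = balanced_class_weights(toy_labels)
print(weights[1])  # 5.0, so each default counts nine times as much as a repaid loan

# F1 scores quoted above: 0.14 (baseline) and 0.45 (balanced model).
print(f"{relative_improvement(0.14, 0.45):.0%}")  # 221%
```

The rarer class receives the larger weight, which is why a weighted model tends to trade some overall accuracy for higher recall on defaults. The computed change of about 221% is consistent with the roughly 220% figure reported.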

Business Intelligence Analyst Virtual Internship

End-to-End Business Intelligence Pipeline for Revenue Optimization: A Case Study of PT Sejahtera Bersama

Acting as a BI Analyst for PT Sejahtera Bersama, I engineered an end-to-end data pipeline in Google BigQuery to resolve fragmented sales data, enabling the identification of a critical "Volume vs. Value" paradox between high-frequency eBooks and high-revenue Robots. By visualizing these insights in Looker Studio, I formulated three strategic initiatives—including product bundling and regional replication—projected to unlock over IDR 175M in optimized revenue.

Technical Highlights
Technologies
  • Google Cloud Platform (BigQuery)
  • Standard SQL
  • GitHub (Version Control)
  • Google Looker Studio
Methodologies
  • Star Schema Data Modeling
  • ETL Pipeline Development (Extract-Transform-Load)
  • Primary Key & Relationship Mapping
  • KPI Dashboard Design
  • Strategic Business Analysis
Deliverables
  • Master Sales Data Table (master_sales_data)
  • Interactive Looker Studio Dashboard
  • SQL Transformation Scripts
  • Comprehensive Project Documentation (PDF)
  • Public GitHub Portfolio Repository
Impact & Insights
  1. Revenue Visibility: Uncovered IDR 175,475,057 in total sales and 11,654 units across 7 categories, revealing that Robots drive 44.3% of revenue while eBooks drive 32.7% of volume.
  2. Strategic Opportunity: Identified a specific cross-sell gap leading to a proposed "Robot Starter Kit" bundle designed to convert high-volume entry customers into high-value hardware buyers.
  3. Regional Optimization: Pinpointed Washington DC as a top performer (IDR 5.5M), providing a data-backed blueprint for replicating success in underperforming markets like Houston and Sacramento.

Big Data Analyst Virtual Internship

Kimia Farma Performance Analytics (2020–2023)

Architected an end-to-end analytics solution for Indonesia's largest pharmaceutical retailer by engineering a BigQuery ELT pipeline to unify 672K+ transactions and implementing complex tiered margin logic to resolve data silos. This initiative enabled real-time performance monitoring that identified a critical 30% geographic revenue dependency and pinpointed specific operational bottlenecks causing a 0.4-point customer satisfaction gap in key branches.

Technical Highlights
Technologies
  • Google Cloud Platform (BigQuery)
  • Standard SQL
  • GitHub
  • Google Looker Studio
  • Data Visualization (Scatter Plots, Choropleth Maps)
Methodologies
  • ELT Architecture
  • Dimensional Modeling
  • DRY Principle (via CTEs)
  • Tiered Business Logic Implementation
  • Gap Analysis
  • KPI Dashboarding
Deliverables
  • Unified analysis_table (14 columns)
  • Interactive Executive Dashboard
  • Comprehensive Data Dictionary
  • Technical Design Documentation
  • Strategic Recommendation Deck
Impact & Insights
  1. Uncovered Geographic Risk: Identified that Jawa Barat drives 29.5% of total revenue (102B IDR), highlighting a massive concentration risk versus emerging markets.
  2. Diagnosed Operational Gaps: Detected branches like Tarakan and Bekasi where high facility ratings (>4.4) masked poor transaction experiences (<3.99), enabling targeted audits.
  3. Quantified Profitability: Calculated 98.54B IDR in Nett Profit across 31 provinces using dynamic pricing tiers, revealing a 0.7% revenue stagnation trend in 2023.
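The tiered margin logic behind the profit figure can be sketched as a lookup that mirrors a SQL CASE WHEN chain. This is a minimal Python illustration only: the tier boundaries, the rates, and the profit formula are hypothetical placeholders, not the values used in the project's BigQuery scripts.

```python
# Hypothetical tiers: (upper price bound in IDR, gross margin rate).
MARGIN_TIERS = [
    (50_000, 0.10),
    (100_000, 0.15),
    (float("inf"), 0.20),
]


def margin_rate(price):
    """Return the rate of the first tier whose upper bound covers `price`."""
    for upper_bound, rate in MARGIN_TIERS:
        if price <= upper_bound:
            return rate
    raise ValueError("price is not covered by any tier")


def net_profit(price, discount_rate):
    """Assumed formula: discounted sale price times the tier's margin rate."""
    net_sales = price * (1 - discount_rate)
    return net_sales * margin_rate(price)


print(margin_rate(75_000))  # 0.15
```

Keeping the tiers in one ordered table, rather than repeating the conditions in every query, follows the same DRY intent as the CTE approach listed under Methodologies.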

Professional Experience

Project-Based Virtual Internship
ID/X Partners Logo

Data Scientist Project-Based Internship

This project-based internship focuses on the core concept of the end-to-end data science lifecycle, ranging from initial business understanding and data collection to automated model deployment. Participants leverage a technology stack including Python, R, and SQL to master specific skills in exploratory data analysis, feature engineering, machine learning modeling, and version control with Git.

Technical Skill

ML modeling, CRISP-DM, Feature Engineering, Exploratory Data Analysis (EDA), Python (Pandas, Scikit-Learn), Power BI (DAX), Git/GitHub
ID:352967IAPDGII1122025
Visit Official PageRakamin Logo
Project-Based Virtual Internship
Bank Muamalat Logo

Business Intelligence Analyst Project-Based Internship

This project-based internship provides Business Intelligence training focusing on technology stacks like SQL (PostgreSQL and BigQuery), Microsoft Excel, and Looker Studio. It cultivates specific skills in database management, advanced data manipulation (such as CTEs and joins), and data storytelling through interactive dashboards for banking portfolio analysis.

Technical Skill

Looker Studio, KPI Dashboard Design, Strategic Business Analysis, Google BigQuery (SQL), ETL Pipeline, Star Schema Modeling, GitHub
ID:352967IAPDGIB29122025
Visit Official PageRakamin Logo
Project-Based Virtual Internship
Kimia Farma Logo

Big Data Analytics Project-Based Internship

This project-based internship provides hands-on experience in SQL querying, Google BigQuery data warehousing, and Looker Studio visualization to support analytical business needs. Participants acquire specific skills in ETL pipeline design, exploratory statistical analysis, and data storytelling to transform complex big data into actionable insights.

Technical Skill

Looker Studio, Gap Analysis, Google BigQuery (SQL), ELT Architecture, Dimensional Modeling, Tiered Business Logic, GitHub
ID:352967IAPDGIK3112025
Visit Official PageRakamin Logo
Professional Experience

Private Driver

Managed end-to-end operational logistics for private clients, ensuring seamless execution of daily schedules across diverse geographic regions. Operated as the primary point of contact for time-sensitive travel requirements, leveraging real-time data analysis to optimize routes and mitigate delays.

Key Achievements & Responsibilities

Operational Efficiency: Designed and executed dynamic route optimization strategies using real-time traffic data and predictive scheduling, ensuring 100% punctuality for critical business appointments.

Crisis Management & Adaptability: Successfully navigated complex logistical challenges (e.g., vehicle maintenance emergencies, sudden schedule pivots, and adverse weather conditions), maintaining operational continuity without disrupting client workflows.

Asset Management: Maintained strict adherence to safety protocols and vehicle performance metrics, conducting proactive preventative maintenance to ensure zero downtime and optimal vehicle reliability.

Technical Skill

Logistics Management, Route Optimization, Predictive Scheduling, Crisis Management, Asset Management, Operational Efficiency