Cleaning and Preparing Data for Analysis

Domain:
Digitalisation; Infocomm Technology & Smart Systems
Content Type:
Classroom
Audience:
Middle Management; Manager; Senior Officer
Course Code:
CRDCL10
Remark:
You should have attended CSC's Data Analytics - Basic Principles and Applications (CRDDA10/CRDDAVL) programme, or be familiar with basic data analytics. After registration, you are required to complete a pre-programme survey form so that we can assess your suitability for the course.
Who Should Attend:
You are a Singapore Public Officer who has some basic understanding of data analytics and statistics, and you wish to prepare and process data effectively.
Programme Overview

Did you know that people often spend about 80% of their time cleaning data and only about 20% doing the analysis?


Good, clean data is not always readily available. The datasets we first encounter are often large, stored in formats that are not easy to use, or contain inaccurate, incomplete or unreasonable values. Sometimes, data from different sources also has to be combined, which can be a tricky and messy job.


In this programme*, you will work through a range of hands-on activities and case studies to learn how to reverse this balance of effort, so that you spend most of your time analysing the data instead and improve your productivity.


*This programme is suitable for officers who need to clean data at work.



Learning Outcomes
  • Recognise the importance of structuring data in a tidy data format and apply the steps to convert ‘messy’ data to ‘tidy’ data for data analysis
  • Describe the dimensions of data quality – validity, accuracy, completeness, consistency, uniformity – and demonstrate the ability to explore the data quality before starting data analysis
  • Describe the data cleaning workflow of inspecting the data, cleaning the data and verifying the cleaning results
  • Give examples of tools to use for data cleaning and preparation
  • Demonstrate the ability to inspect a dataset to check the validity and profile the data using statistical summaries
  • Perform the steps of data cleaning on a dataset using add-ins/tools available in Microsoft Excel
Last updated: 22 Jan 2025
Principles of tidy data 
  • Difference between tidy data and structured data
  • Importance of tidy data for data analysis
  • Converting ‘messy’ data to tidy data
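
As a rough illustration of converting ‘messy’ data to tidy data, the sketch below reshapes a wide table (one column per year) into a layout with one observation per row. The programme itself uses Microsoft Excel; the Python function `wide_to_tidy`, its parameter names and the sample figures are hypothetical and are not taken from the course material.

```python
def wide_to_tidy(rows, id_field, var_name, value_name):
    """Reshape wide rows (values spread across columns) into tidy rows."""
    tidy = []
    for row in rows:
        for column, value in row.items():
            if column == id_field:
                continue  # the identifier is repeated on every tidy row
            tidy.append({id_field: row[id_field], var_name: column, value_name: value})
    return tidy


# Hypothetical 'messy' table: the years are stored as column headers.
messy = [
    {"agency": "A", "2023": 120, "2024": 135},
    {"agency": "B", "2023": 80, "2024": 95},
]

tidy = wide_to_tidy(messy, id_field="agency", var_name="year", value_name="cases")
for record in tidy:
    print(record)
```

Each tidy row now holds one agency, one year and one value, which is the shape most analysis tools expect.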
Understanding data quality 
  • Validity – does the data conform to business rules and constraints?
  • Accuracy – does the data conform to standard or true values?
  • Completeness – to what extent is the data known / complete?
  • Consistency – is the data consistent within the same dataset or across multiple datasets?
  • Uniformity – is the data specified using the same units of measure?
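
Some of the dimensions above can be checked with very little code. The following is a minimal sketch, assuming a small hypothetical list of records and illustrative business rules (an age between 0 and 120, a six-digit postal code). It covers completeness, validity and uniformity only; accuracy and consistency need reference values or a second dataset and are not shown. All function and field names are made up for this example.

```python
# Hypothetical records with deliberately planted quality problems.
records = [
    {"id": 1, "age": 34, "height": "172 cm", "postal_code": "123456"},
    {"id": 2, "age": -5, "height": "1.65 m", "postal_code": "654321"},
    {"id": 3, "age": None, "height": "180 cm", "postal_code": "12345"},
]


def completeness(rows, field):
    """Share of rows where the field is known (not None or empty)."""
    known = sum(1 for r in rows if r.get(field) not in (None, ""))
    return known / len(rows)


def invalid_rows(rows, field, rule):
    """IDs of rows whose non-missing value breaks a business rule."""
    return [r["id"] for r in rows if r.get(field) is not None and not rule(r[field])]


def units_used(rows, field):
    """Set of unit suffixes found in a text field such as '172 cm'."""
    return {r[field].split()[-1] for r in rows if r.get(field)}


age_completeness = completeness(records, "age")
bad_ages = invalid_rows(records, "age", lambda a: 0 <= a <= 120)
bad_postal = invalid_rows(records, "postal_code", lambda p: len(p) == 6 and p.isdigit())
height_units = units_used(records, "height")

print(age_completeness, bad_ages, bad_postal, height_units)
```

More than one unit in `height_units` signals a uniformity problem that should be resolved before any averages are computed.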
Iterative workflow / process of data cleaning 
  • Inspect the data to detect unexpected, incorrect or inconsistent data
  • Clean the data to remove or fix the anomalies
  • Verify the results for correctness
  • Record and/or report the changes made
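
The iterative nature of this workflow can be sketched as a small loop: inspect, clean, inspect again to verify, and keep a log of every change. This is only an illustration under simple assumptions (the only anomaly handled is stray whitespace); `find_issues`, `fix_issues` and the sample names are hypothetical.

```python
def find_issues(values):
    """Inspect: positions of entries with leading or trailing whitespace."""
    return [i for i, v in enumerate(values) if v != v.strip()]


def fix_issues(values, positions, log):
    """Clean: fix the flagged entries and record each change made."""
    cleaned = list(values)
    for i in positions:
        cleaned[i] = values[i].strip()
        log.append({"row": i, "before": values[i], "after": cleaned[i]})
    return cleaned


change_log = []
data = ["Alice", " Bob", "Carol  "]

issues = find_issues(data)
while issues:  # repeat until inspection finds nothing left to fix
    data = fix_issues(data, issues, change_log)
    issues = find_issues(data)  # verify by inspecting the cleaned data again

print(data)
print(change_log)
```

Keeping the change log separate from the data makes it straightforward to report what was altered and to undo a fix if it turns out to be wrong.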
Inspecting data (hands-on) 
  • Analysing the validity of data (e.g. data type, range constraints, mandatory and unique constraints, set-membership constraints, regular expression patterns)
  • Data profiling using summary statistics
  • Data visualisation to find outliers
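
The inspection steps above can be sketched in a few lines of Python using only the standard library. The sample amounts and e-mail addresses are hypothetical, the e-mail pattern is deliberately simple rather than a full validator, and the outlier check uses the common 1.5 × IQR rule as a numeric stand-in for what a box plot would show. The programme itself works in Microsoft Excel.

```python
import re
import statistics

# Hypothetical sample data with one invalid e-mail and one extreme amount.
amounts = [120, 135, 128, 131, 900, 126, 133]
emails = ["tan@example.gov.sg", "lee[at]example.gov.sg", "lim@example.gov.sg"]

# Validity: regular-expression pattern check.
EMAIL_PATTERN = re.compile(r"^[\w.+-]+@[\w-]+(\.[\w-]+)+$")
invalid_emails = [e for e in emails if not EMAIL_PATTERN.match(e)]

# Profiling: summary statistics for a numeric column.
profile = {
    "count": len(amounts),
    "min": min(amounts),
    "max": max(amounts),
    "mean": statistics.mean(amounts),
    "median": statistics.median(amounts),
}

# Outliers: values more than 1.5 x IQR beyond the first or third quartile.
q1, _, q3 = statistics.quantiles(amounts, n=4)
iqr = q3 - q1
outliers = [a for a in amounts if a < q1 - 1.5 * iqr or a > q3 + 1.5 * iqr]

print(invalid_emails)
print(profile)
print(outliers)
```

Note how far the mean sits from the median in `profile`: a gap like that is often the first hint that an outlier is present.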
Cleaning data (hands-on) 
  • Remove irrelevant data
  • Remove duplicates
  • Transform data types
  • Handle syntax errors, especially with text data (e.g. remove white spaces, fix typos)
  • Standardise data formats
  • Transform data by scaling and normalisation
  • Deal with missing values
  • Deal with outliers
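
Several of the cleaning steps listed above can be chained together, as in the sketch below. It assumes a small hypothetical dataset with two date formats, treats rows as duplicates only if they are identical after cleaning, fills missing scores with the median (one of several possible choices) and applies min-max scaling. None of these choices is prescribed by the programme, which uses Microsoft Excel; the field names and the `parse_date` helper are made up for illustration.

```python
import statistics
from datetime import datetime

# Hypothetical raw rows: stray spaces, mixed date formats, a missing score.
raw = [
    {"name": "  Tan Ah Kow ", "joined": "2024-01-15", "score": "70"},
    {"name": "Tan Ah Kow", "joined": "15/01/2024", "score": "70"},
    {"name": "Lee Mei Ling", "joined": "03/02/2024", "score": ""},
    {"name": "Lim Wei", "joined": "2024-03-20", "score": "90"},
]


def parse_date(text):
    """Standardise a date to ISO format, trying each assumed input format."""
    for fmt in ("%Y-%m-%d", "%d/%m/%Y"):
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"Unrecognised date format: {text!r}")


# 1. Remove extra white space, standardise formats, transform data types.
cleaned = [
    {
        "name": " ".join(row["name"].split()),
        "joined": parse_date(row["joined"]),
        "score": float(row["score"]) if row["score"] else None,
    }
    for row in raw
]

# 2. Remove duplicates (rows that are identical after cleaning).
seen, unique = set(), []
for row in cleaned:
    key = tuple(row.items())
    if key not in seen:
        seen.add(key)
        unique.append(row)

# 3. Deal with missing values: fill with the median of the known scores.
known = [r["score"] for r in unique if r["score"] is not None]
median_score = statistics.median(known)
for r in unique:
    if r["score"] is None:
        r["score"] = median_score

# 4. Min-max scaling to [0, 1]; assumes the known scores are not all equal.
lo, hi = min(known), max(known)
for r in unique:
    r["scaled"] = (r["score"] - lo) / (hi - lo)

for r in unique:
    print(r)
```

The order matters: the two "Tan Ah Kow" rows only become recognisable as duplicates once white space and date formats have been standardised.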
9.00 hours
CSC
Session dates:
  • 06 Aug (09:00 - 17:00)
$1,111.80 per participant
(including 9% GST of $91.80)
Registration information:
  • Registration closes on 30 Jul 2025
  • Places are allocated on a first-come, first-served basis.
  • Programme fees may vary between financial years.