This repository contains the code and documentation submitted in March 2025 for Data Wrangling assignment at the Singapore University of Social Sciences (SUSS). The assignment focuses on importing, transforming, and analysing survey data collected from users of public services provided by four organisations.
Language: SQL (MySQL)
Tools: MySQL Server, MySQL Workbench, Microsoft Word
Import survey data collected from 4,858 users across four public-funded organisations.
Clean and enhance the data:
-
Replace numerical codes with human-readable labels.
-
Convert date codes (days since 9 Aug 1965) into MySQL DATE format.
-
Optimise the table for efficient storage using appropriate ALTER TABLE statements.
-
Generate summary tables to uncover meaningful relationships in the dataset.
-
Explain insights through commentary based on SQL outputs.
To map the numeric values with their corresponding text labels according to the data dictionary, CASE statements were used in SQL query in Table Plus. The transformed columns are ‘urgency’ , ‘subsidy’, ‘gender’, ‘coordinated’, ‘similarity_to_ideal’ and ‘willingness_to_recommend’.
For the ‘urgency’ variable, when the value is 1, then the corresponding text label is P1, and so on and so forth.
For ‘subsidy’ variable, when the value is 1, then the corresponding text label is Y, so on and so forth.
For ‘coordinated’ variable, when the value is 1, then the corresponding text label is Never,
when value is 2, then Sometimes, so on and so forth.
11,
then the corresponding text label is ‘Unsure’ and ‘NA’ when the value is 999
In the query, “ELSE similarity_to_ideal” tells MySQL to keep 0-10 as is if the value is not 11 or 999.
1, then the corresponding text label is ‘Definitely No’
when value is 2, then ‘Probably No’, so on and so forth.
The numeric values in the identified columns have been mapped to their corresponding text labels and a new column named ‘caldate’ with MySQL DATE type ‘2022-04-01’ has been added.

The DESCRIBE TMA_data_labelled statement must be run separately from the ALTER TABLE query and the result is as follows:
Two summary tables are generated using GROUP BY queries to explore relationships, each table is accompanied by a brief interpretation in the main report.
## Summary table #1 – Coordination influenced Willingness to Recommend.
This table investigate the average similarity_to_ideal score among users of various age groups and by their genders.
Back to Projects main page
