Five Golden Rules for Cleaning Data in Power BI
After getting data into Power BI from different sources, it is better to transform the dataset before loading it into the Power BI since it improves queries’ performance and saves the internal memory space of Power BI.
Here are five Golden rules for cleaning data efficiently in Power BI:
Dealing with Column headers
The first step of any cleaning process is to decide and indicate the columns that will be used in the data analysis as per the requirement.
Promote Headers and Remove Top Rows
It is not mandatory for every data source to have the right column names as their column headers. The column headers can be located in the first row of the dataset. Sometimes, they can be found after several empty rows. Hence, the first step is to make the right row the column headers. It can be done using the option
Use First Row as Headers.
Use First Row as Headers option makes the first row of the table the column headers.
When there are empty rows in the dataset before the column names row. Remove those rows manually till the column names row becomes the first row. Then, promote the first row to column headers. These empty rows can be removed using the option
Remove Top Rows . This option can be accessed by clicking on
Remove Rows located at the top pane of the application.
A dataset can contain non-user-friendly column names. For instance, let's consider that our dataset is sourced from a fitness band. The column names might contain abbreviations, and spelling errors, or there might not be proper spacing between words of the column name. A Power BI Analyst should take into consideration that the end-user who uses the reports or dashboards may not be from a technical background. So, it’s better if the column names do not contain any technical jargon.
# Correct Spelling error
Saels --> Sales# Add Space between words
Targetcountry --> Target Country