Database cleansing is the process of identifying incorrect, incomplete, corrupt or extraneous data in a record or database, and then replacing, modifying or deleting it.
Wrong or unreliable data can lead to false conclusions and poor decisions, which can be costly to a business. Companies use customer information databases that record data such as contact information, addresses and phone numbers. A company cannot interpret its data accurately if the data itself is inaccurate or incomplete. Extraneous and incorrect addresses cost the company money when mail is returned, and can even cost it the customer.
The Data Cleansing Process/Steps
- Data Importation – Import the uncleaned data from the various databases.
- Combine the datasets – Merge the differently formatted datasets into one dataset.
- Append or reconstruct missing data – Recreate missing data eg postcodes, state, street name, street type, phone numbers etc, where possible.
- Standardize Data – Combine, separate or modify the data so that each field holds only one type of data ie First Name, Last Name, Email Address, Phone Number etc each sit in their own appropriate field.
- Normalize Data – Convert all similar data into a single format eg telephone numbers are converted into the form (03) 9999 9999.
- De-duplication – Identify duplicates, check manually and update.
- Verify data and enhance – Validate the data against data from other sources eg NCOA, Datawash Services and update to enhance data quality.
- Export Data – Export data into the required format eg CSV, Excel, XML etc.
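As a rough illustration, a few of the steps above (combine, standardize, normalize, de-duplicate and export) can be sketched in Python using only the standard library. This is a minimal sketch under stated assumptions, not a definitive implementation: the field names, the sample records, the choice of email address as the duplicate key and the rule that only 10-digit landline numbers are reformatted are all hypothetical. Appending missing data and verifying against outside sources are left out because they depend on external reference data.

```python
import csv
import io
import re

FIELDS = ["first_name", "last_name", "email", "phone"]  # hypothetical field names


def normalize_phone(raw):
    """Return a 10-digit landline number as '(0X) XXXX XXXX', else None."""
    digits = re.sub(r"\D", "", raw or "")
    # Assumption: landline numbers have 10 digits and an area code of 02, 03, 07 or 08.
    if len(digits) == 10 and digits[0] == "0" and digits[1] in "2378":
        return f"({digits[:2]}) {digits[2:6]} {digits[6:]}"
    return None  # could not normalize; leave for manual review


def standardize(record):
    """Trim whitespace, title-case names (a crude rule), lower-case the email."""
    phone = record.get("phone") or ""
    return {
        "first_name": (record.get("first_name") or "").strip().title(),
        "last_name": (record.get("last_name") or "").strip().title(),
        "email": (record.get("email") or "").strip().lower(),
        # Keep the original value when the number cannot be normalized.
        "phone": normalize_phone(phone) or phone.strip(),
    }


def deduplicate(records):
    """Keep the first record per email; return (unique, suspected_duplicates)."""
    seen = set()
    unique, duplicates = [], []
    for rec in records:
        key = rec["email"]
        if key and key in seen:
            duplicates.append(rec)  # set aside for a manual check
        else:
            seen.add(key)
            unique.append(rec)
    return unique, duplicates


def export_csv(records):
    """Write the records to CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


# Two made-up sources holding the same customer in different formats.
source_a = [{"first_name": " jane ", "last_name": "DOE",
             "email": "Jane.Doe@Example.com", "phone": "03 9999 9999"}]
source_b = [{"first_name": "Jane", "last_name": "Doe",
             "email": "jane.doe@example.com ", "phone": "0399999999"},
            {"first_name": "bob", "last_name": "smith",
             "email": "bob@example.com", "phone": "12345"}]

combined = [standardize(r) for r in source_a + source_b]  # combine + standardize
unique, duplicates = deduplicate(combined)
print(export_csv(unique))
```

Suspected duplicates are returned separately rather than silently dropped, so they can still be checked manually as the de-duplication step describes. Numbers that do not match the assumed landline pattern are passed through unchanged for the same reason.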
Data Cleansing helps you by:
- saving you time in merging data before each use;
- reducing your marketing campaign costs as a result of reduction in duplication and returned mail etc;
- increasing your data reliability;
- improving your company's image.