Data Source Analysis
IA1:
For IA2 & IA3, data or a data source should have been provided.
File Format
When looking at the data you should try to understand the file format being used. It will probably be CSV, JSON or XML. The data inside will generally include at least text and numbers.
Do you need all the data or only part of it? Are there issues with the data that need to be resolved?
You should look at the provided file format and explain how you will use it and any issues it could cause.
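One way to answer these questions is to summarise the file before deciding how to use it. The following is a minimal sketch, assuming the data is a CSV file with a heading row; the function name `summarise_csv` and the two-row sample are made up for illustration and are not part of any provided data.

```python
import csv
import io


def summarise_csv(text):
    """Return the column names, the row count and the empty cells per column."""
    reader = csv.DictReader(io.StringIO(text))
    columns = reader.fieldnames or []
    empty_counts = {name: 0 for name in columns}
    row_count = 0
    for row in reader:
        row_count += 1
        for name in columns:
            # Missing trailing fields come back as None, blank ones as "".
            if not (row.get(name) or "").strip():
                empty_counts[name] += 1
    return columns, row_count, empty_counts


# Hypothetical sample, invented only to show the output shape.
sample = "name,score,comment\nAda,91,\nBob,,late entry\n"
columns, row_count, empty_counts = summarise_csv(sample)
print(columns, row_count, empty_counts)
```

Columns with many empty cells are worth mentioning in your analysis, either as data you do not need or as an issue to resolve.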
Source/Reliability
You should evaluate the source of the data and how reliable it is. Do you trust it? Is it likely to be biased? Is it limited or incomplete?
Sample Data
Provide some sample data.
Password DB Example - Data Source Analysis
File Format
The file is a CSV file, which means the heading row needs to be skipped when the data is imported. Also, all the fields are text.
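Skipping the heading row can be sketched as follows. This is only an illustration: the comma-separated layout in `CSV_TEXT` is an assumption based on the columns in the sample data, and `read_rows` is a hypothetical helper name.

```python
import csv
import io

# Assumed file contents, based on the sample data columns; the real file may differ.
CSV_TEXT = (
    "Name,Username,Password,URL,Comments\n"
    "Gmail,apple@gmail.com,123abc,https://www.gmail.com/,\n"
    "Hotmail,bannanahotmail.com,Password1,Hot Mail,\n"
)


def read_rows(text):
    """Return the data rows as lists of strings, without the heading row."""
    reader = csv.reader(io.StringIO(text))
    next(reader, None)  # skip the heading row (does nothing if the file is empty)
    return list(reader)


rows = read_rows(CSV_TEXT)
for row in rows:
    print(row)
```

Every value comes back as a string, which matches the note that all the fields are text.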
Source/Reliability
The data provided looks unreliable: one of the URLs is not formatted correctly. Validation should be used so that this value is rejected as a URL. One of the usernames looks like it should be an email address but, apart from providing a warning, it is not possible to know what the format should be.
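The URL check and the username warning could look something like the sketch below. The rules are assumptions chosen for this example (only `http` and `https` URLs are accepted, and a username containing a dot but no `@` triggers a warning), and the function names `is_valid_url` and `username_warning` are hypothetical.

```python
from urllib.parse import urlparse


def is_valid_url(value):
    """Accept only http/https URLs that include a host name (assumed rule)."""
    try:
        parts = urlparse(value.strip())
    except ValueError:
        return False
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def username_warning(username):
    """Return a warning if the username looks like an email missing its '@'."""
    # Heuristic only: a username is allowed to be any text, so this warns
    # rather than rejects.
    if "@" not in username and "." in username:
        return "Username looks like an email address that is missing '@'."
    return None


print(is_valid_url("https://www.gmail.com/"))
print(is_valid_url("Hot Mail"))
print(username_warning("bannanahotmail.com"))
```

The URL is rejected outright because its format is known, while the username only gets a warning because its intended format is not.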
The passwords provided are also bad. It might be good to provide a generator or some kind of rating to encourage better passwords.
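A rating and a generator could be sketched as below. The scoring rule is an assumption made for this example (at least 12 characters and a mix of character types), not an established standard, and `rate_password` and `generate_password` are hypothetical names.

```python
import secrets
import string

MIN_LENGTH = 12  # assumed threshold for this sketch


def rate_password(password):
    """Rate a password as 'weak', 'fair' or 'strong' using simple rules."""
    variety = sum([
        any(c.islower() for c in password),
        any(c.isupper() for c in password),
        any(c.isdigit() for c in password),
        any(c in string.punctuation for c in password),
    ])
    if len(password) < MIN_LENGTH or variety < 3:
        return "weak"
    return "strong" if variety == 4 else "fair"


def generate_password(length=16):
    """Build a random password with the secrets module."""
    alphabet = string.ascii_letters + string.digits + string.punctuation
    return "".join(secrets.choice(alphabet) for _ in range(length))


print(rate_password("123abc"))
print(rate_password("Password1"))
print(generate_password())
```

Under these rules both sample passwords rate as weak because they are shorter than 12 characters.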
The dataset is also very small. If more data was collected, other people may want additional information that isn't shown in this dataset.
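Since the passwords are stored and we need to get them back, we cannot use hashing, which is one-way. In the future it would be better to encrypt the entire database to make it more secure. SQLite does not encrypt databases by default, but extensions such as the SQLite Encryption Extension (SEE) or SQLCipher add encrypted database support, which might be a solution when encryption is added.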
Sample Data
Name | Username | Password | URL | Comments |
Gmail | apple@gmail.com | 123abc | https://www.gmail.com/ | |
Hotmail | bannanahotmail.com | Password1 | Hot Mail | |
NetFlix | apple@gmail.com | 123abc | https://nextfix.com/ | |