What does the term ‘data profiling’ mean? Data profiling is the process of examining data from available sources to confirm that it is suitable for analysis and to highlight any quality issues. This can involve applying statistical tools, extracting metadata, checking that ranges are as expected, checking consistency between datasets and identifying probable key values and relationships. A… Continue reading “BCS L4 Data Analysis Exercise 4”
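As a rough illustration of the profiling checks listed in that excerpt, here is a minimal Python sketch using only the standard library. The sample rows, the column names and the expected age range of 0 to 120 are all made up for the example, and a real profiling job would more likely use a dedicated tool.

```python
from statistics import mean

# Hypothetical sample records; None marks a missing value.
rows = [
    {"customer_id": 1, "age": 34, "region": "North"},
    {"customer_id": 2, "age": 51, "region": "South"},
    {"customer_id": 3, "age": None, "region": "North"},
    {"customer_id": 4, "age": 132, "region": "South"},
]


def profile(rows, expected_ranges):
    """Return simple per-column profiling metadata."""
    report = {}
    for col in rows[0]:
        values = [r[col] for r in rows]
        present = [v for v in values if v is not None]
        info = {
            "missing": len(values) - len(present),
            "distinct": len(set(present)),
            # A column with no gaps and no repeats is a probable key.
            "probable_key": len(present) == len(values)
            and len(set(present)) == len(values),
        }
        # Basic statistics and a range check for numeric columns only.
        if present and all(isinstance(v, (int, float)) for v in present):
            info["min"] = min(present)
            info["max"] = max(present)
            info["mean"] = mean(present)
            if col in expected_ranges:
                low, high = expected_ranges[col]
                info["out_of_range"] = [v for v in present if not low <= v <= high]
        report[col] = info
    return report


# The expected range for "age" is an assumption made for this example.
report = profile(rows, {"age": (0, 120)})
for column, info in report.items():
    print(column, info)
```

Here customer_id comes out as a probable key, while age is flagged for one missing value and one value outside the expected range.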
BCS L4 Data Analysis Exercise 3
All programming languages utilise program constructs. In imperative languages they are used to control the order (flow) in which statements are executed (or not executed). Describe and give examples of the following programming constructs: Sequence (top down processing): Statements are executed in the order in which they appear in the code, e.g. the code: printf("Hello,"); printf(" World\n"); … Continue reading “BCS L4 Data Analysis Exercise 3”
BCS L4 Data Analysis Tools Exercise 2
Data can be filtered or refined to ensure that only the relevant data is integrated. Describe a situation where data has been filtered to underpin a business objective. A common technique in marketing is A/B testing, where different customers are randomly given different offers on a product, e.g. one group is offered a discount whilst another… Continue reading “BCS L4 Data Analysis Tools Exercise 2”
BCS level 4 Data Analysis Apprenticeship Tools Exercise 1
Exercise 1 What is the purpose of data integration? Data integration reduces the complexity of data by unifying multiple datasets to create a single view of the data. How does data integration: improve the speed of analysing data? Integration may reduce the total volume of data to be processed by pre-excluding data not needed for the analysis… Continue reading “BCS level 4 Data Analysis Apprenticeship Tools Exercise 1”
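To make the "single view" idea in that excerpt concrete, here is a small Python sketch that unifies two datasets on a shared key and pre-excludes records the analysis does not need. The datasets, field names and the choice to exclude inactive customers are all hypothetical, chosen only to illustrate the idea.

```python
# Hypothetical source datasets sharing a customer_id key.
customers = [
    {"customer_id": 1, "name": "Asha", "active": True},
    {"customer_id": 2, "name": "Ben", "active": False},
    {"customer_id": 3, "name": "Cara", "active": True},
]
orders = [
    {"customer_id": 1, "total": 20.0},
    {"customer_id": 1, "total": 15.5},
    {"customer_id": 2, "total": 99.0},
    {"customer_id": 3, "total": 42.0},
]


def integrate(customers, orders):
    """Build one row per active customer with their order count and spend."""
    # Pre-exclude records the analysis does not need (inactive customers).
    view = {
        c["customer_id"]: {"name": c["name"], "orders": 0, "spend": 0.0}
        for c in customers
        if c["active"]
    }
    for o in orders:
        row = view.get(o["customer_id"])
        if row is not None:  # skip orders belonging to excluded customers
            row["orders"] += 1
            row["spend"] += o["total"]
    return view


single_view = integrate(customers, orders)
for customer_id, row in single_view.items():
    print(customer_id, row)
```

Any later analysis then works from the smaller single_view structure rather than re-reading and re-joining both source datasets.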
PowerBI, what if you have multiple date fields
In Microsoft Power BI (or Power Pivot in Excel), if you want to work with dates then you will usually need a date table. Date sliders and the date-based DAX functions rely on there being a date table linked to the fields you want to work with. But what if your data has multiple date… Continue reading “PowerBI, what if you have multiple date fields”
Excel vs ISO date format
Our team manager has decided that we should use Trello to manage our work on a project. There does not seem to be any out-of-the-box MI reporting in Trello, not in the free version at least, so he asked me to look at whether we could report using Excel ‘or something’. You can… Continue reading “Excel vs ISO date format”
Exercises for BCS L4 Apprenticeship in Data Analysis
I’m doing a Level 4 qualification in data analysis and the training provider recently sent me a set of exercises to help prepare for an exam I’ll have to take at some point. The idea is to web research the questions then fill in the answers. This is the first batch. Exercise 1a What is… Continue reading “Exercises for BCS L4 Apprenticeship in Data Analysis”
Disappointed in Course Materials
As part of my apprenticeship I’ve been provided with access to course materials on UCertify. I’m seriously worried at the poor quality of the materials. Lots of typos, grammar issues and spelling issues, plus both the materials and the quizzes reference online technologies that have now been removed by the supplier (and one that may… Continue reading “Disappointed in Course Materials”
rmarkdown at BirminghamR
This month’s BirminghamR is on rmarkdown, which basically lets you embed your R code in documents that can then be output in a variety of formats. There appears to be a cheatsheet.
Character Vectors and Factors
Dipping back into “Statistics for Linguists” by Bodo Winter for a bit. The example given is gender <- c('F', 'M', 'F', 'F'); apparently you can use either single or double quotes. When you execute a call to gender to display it on the console, R responds with double quotes. You can use class(gender) to… Continue reading “Character Vectors and Factors”