# Statistics – Large Data Sets

This page collects resources for teaching Statistics with large data sets, including exam board-specific resources and other data sets that could be used in the classroom.

The subject content for Mathematics includes the use of data in Statistics and specifies that “Specifications should require students to explore the data set(s), and associated contexts, during their course of study to enable them to perform tasks that assume familiarity with the contexts, the main features of the data and the ways in which technology can help explore the data. Specifications should also require students to demonstrate the ability to analyse a subset or features of the data using a calculator with standard statistical functions.”

Calculators used must include the following features: an iterative function and the ability to compute summary statistics and access probabilities from standard statistical distributions.
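To make the calculator requirements concrete, the following is a minimal Python sketch of the same two tasks: computing summary statistics and reading probabilities from standard distributions. It is an illustration only, not part of any specification. The `daily_max_temp` values are hypothetical, and the helper names `binomial_pmf` and `binomial_cdf` are made up for this example.

```python
import math
import statistics
from statistics import NormalDist

# Hypothetical sample: illustrative values only, not taken from any exam board data set.
daily_max_temp = [14.2, 15.8, 13.9, 17.1, 16.4, 15.0, 18.3, 14.7]

# Summary statistics, as a calculator's one-variable statistics mode would give.
mean = statistics.mean(daily_max_temp)
median = statistics.median(daily_max_temp)
sample_sd = statistics.stdev(daily_max_temp)       # divisor n - 1
population_sd = statistics.pstdev(daily_max_temp)  # divisor n


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def binomial_cdf(k, n, p):
    """P(X <= k) for X ~ B(n, p)."""
    return sum(binomial_pmf(i, n, p) for i in range(k + 1))


# Probabilities from standard distributions.
p_at_most_3 = binomial_cdf(3, 10, 0.5)         # P(X <= 3) for X ~ B(10, 0.5)
p_below = NormalDist(mu=0, sigma=1).cdf(1.96)  # P(Z < 1.96) for Z ~ N(0, 1)

print(f"mean={mean:.3f}, median={median:.3f}")
print(f"sample sd={sample_sd:.3f}, population sd={population_sd:.3f}")
print(f"P(X <= 3)={p_at_most_3:.6f}, P(Z < 1.96)={p_below:.4f}")
```

Printing both standard deviations is deliberate: students often need to decide which divisor their calculator is using.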

From MEI, see Working with Large Data Sets, which is relevant for all the examination boards and includes workbooks from The Advanced Mathematics Support Programme (AMSP) that show how to use Excel and Desmos to investigate the large data sets.

Also from MEI, the freely available Introduction to Data Science course uses the exam boards’ large data sets. See also this YouTube playlist of the MEI large data set videos.

From CASIO, for the CG50 graphics calculator, resources for large data sets from all the exam boards.

Examination board-specific resources:

AQA: The large data set, the UK Department for Transport Stock Vehicle Database, is found with their Assessment resources. See also Notes and guidance: Large data set. On AQA’s All About Maths, see AQA Large Data Set – Guidance and Worksheets, which includes a tool to help support teaching the statistics content of the specification. This is an amended version of the large data set spreadsheet: AQA’s large data set is included, but there are many additional sheets which allow students to explore it. The tools are all provided for students, so they do not need knowledge of Excel and time can be spent on interpreting the data rather than learning about Excel.

SaveMyExams AQA A Level Large Data Set – Revision Notes

Pearson Edexcel: Weather data samples provided by the Met Office are included with the specification materials; see the Data Set Activities from the Teaching and Learning Materials under Teacher Support.

Edexcel’s Teaching and Learning Materials offers Statistics support.
Note the menu on the left (select show more) – direct link to teacher guides.

On CrashMATHS, guidance notes are available for Edexcel’s large data set.

Resources on TES

Large Data Sets on Passion for Maths, very well-received introductory videos (Edexcel, OCR A and OCR B).

SaveMyExams Edexcel A Level Large Data Set – Revision Notes

From OCR comes a helpful document, written when the current specification was introduced: Teaching Statistics Using Large Data Sets, which includes information and teaching suggestions. Information is provided on both OCR Specification A and OCR (MEI) Specification B. The data set for Specification A is under Pre-release materials on this page. Four sets of data make up the large data set: two each from the censuses of 2001 and 2011, with two on method of travel to work and two showing the age structure of the population.

For OCR Specification A, under Teaching Activities, you will find large data set activities including starter activities, a sampling activity and a resource using the large data set to investigate whether usage of underground, metro, light rail and tram (UMLRT) has increased between 2001 and 2011.

TES Resource: OCR Large Data Set – a familiarisation task from J Forsythe

SaveMyExams OCR A Large Data Set Revision Notes

MEI’s approach is different in that three data sets are provided for teaching; these are used in a three-yearly cycle.

MEI have a helpful resource page on Data Sets, with data sets provided for teachers of statistics to use with their students. A series of three large data sets with notes for teachers has been made available through The Royal Statistical Society as part of its support for statistics education in schools.

MEI data sets are available on GeoGebra.

Further data set resources

From Hodder Education, see this page for a data set on Cycling accidents. This data set has 93 records on cycling accidents including, for example, distance from home and whether a helmet was worn (y/n). The data is discussed in Hodder’s book for MEI A Level Maths, but it is also a very useful resource in its own right. We can introduce larger data sets for younger students too, and a set such as this would work well.

For a collection of data sets for analysis, check Douglas Butler’s TSM Resources – Data. This well organised collection includes data sets, resources for teaching and also Excel Tutorials. The teaching resources include Making Statistics Vital from Jonny Griffiths, a resource I included on this page on RISPS from Jonny Griffiths. Look at this task on World Wide Statistics, for example, which includes the task with answers and a spreadsheet with data for 191 countries.

See this post on Regression which includes the use of GeoGebra for data analysis.

From Cambridge Mathematics, this post looks at using Excel to explore the variability of samples.
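The same idea can be explored outside Excel. The sketch below uses Python instead of the spreadsheet approach in that post, and it is not taken from it. It draws repeated simple random samples from a simulated population and compares how much the sample means vary for two sample sizes. The population (1000 values from a normal model with mean 50 and standard deviation 10), the sample sizes and the function name `sample_means` are all assumptions chosen for illustration; in class, a column from a real large data set would take the place of `population`.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

# Hypothetical population: simulated values standing in for one column of a large data set.
population = [random.gauss(50, 10) for _ in range(1000)]


def sample_means(data, sample_size, repeats):
    """Return the means of `repeats` simple random samples drawn without replacement."""
    return [
        statistics.mean(random.sample(data, sample_size))
        for _ in range(repeats)
    ]


means_small = sample_means(population, 10, 500)   # 500 samples of size 10
means_large = sample_means(population, 100, 500)  # 500 samples of size 100

# The spread of the sample means shows how variable each kind of sample is.
spread_small = statistics.stdev(means_small)
spread_large = statistics.stdev(means_large)

print(f"sd of sample means, n=10:  {spread_small:.2f}")
print(f"sd of sample means, n=100: {spread_large:.2f}")
```

The means of the larger samples should cluster much more tightly around the population mean, which is the point students are meant to notice.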

From WJEC, we have their KS5 resource, Exploring Large Data Sets, which includes several data sets and very comprehensive Teacher notes.

The notes include many questions to consider and also clear instructions for Excel and GeoGebra.

On STEM Learning, you will find many excellent Statistics resources, including this Descriptive Statistics resource. See Statistics, a section of the A level mathematics resource packages.