Real-world data sets are often very large, but collecting real-world data can be difficult, at least at the beginning of a project.
To create larger sets of test data, we can use Python's NumPy module, which provides many ways to generate random data sets of any size.
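As a minimal sketch of what this can look like, the example below uses NumPy's random generator to build two test data sets. The sizes, value ranges, choice of distributions, and the seed are assumptions picked for illustration only.

```python
import numpy as np

# Seeded generator so the example is reproducible (the seed value is arbitrary).
rng = np.random.default_rng(seed=42)

# 250 floats drawn uniformly from the half-open interval [0.0, 5.0).
uniform_data = rng.uniform(0.0, 5.0, size=250)

# 100,000 floats from a normal distribution with mean 5.0 and
# standard deviation 1.0.
normal_data = rng.normal(loc=5.0, scale=1.0, size=100_000)

print(uniform_data.shape, normal_data.shape)
```

Changing the `size` argument is enough to make either data set as small or as large as a test requires.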