Those of you who are computer-oriented know that data analysis typically takes much less time and effort than data preparation. Furthermore, if you make a mistake in your data analysis, you can often just repeat the process using different tools or a fresh approach to your original question. As long as the data is prepared properly, you and your colleagues can re-analyze your data to your heart's content. Contrariwise, if your data is not prepared in a manner that supports sensible analysis, there's little you can do to extricate yourself from the situation. For this reason, data preparation is, in my experience, much more important than data analysis.
Furthermore, the best type of data preparation involves data simplification: transforming complex information into simple information.
Throughout my career, I've relied on simple open source utilities and short scripts to simplify my data, producing data products that were self-explanatory, permanent, and easy to merge with other types of data. As it happens, data simplification is not simple. There's a set of skills and methods that must be mastered. Hence, my book.
Over the next few weeks, I will be blogging on topics selected from Data Simplification: Taming Information With Open Source Tools. I hope I can convince you that this is a book worth reading.
- Jules Berman
key words: data simplification, data analysis, data repurposing, simplifying data, taming data, data wrangling, computer science, information science, jules j berman