Developing a Testable Batch Spark Application

Introduction: In my experience, developing testable Spark application code is not an easy task for data practitioners. I am not going to discuss the underlying reasons here. In this post, I present my reasoning while developing a testable batch Spark application. The text has two sections. In the first, "TDD - Developing code from the tests", I show an example of how to develop code that is modular, readable, comprehensive, testable, and easy to maintain. In the last, "More than producing pretty code - it’s about building organizational knowledge", I emphasize the benefits of using TDD in data projects, based on my experience and on other sources that may help you understand this methodology. ...

December 29, 2024 · Leandro Kellermann de Oliveira