Self-service end-to-end data pipeline development
eSimplicity builds data pipelines that connect to data sources of all types, ingest the data into a computing platform for processing, integrate and transform it, apply machine learning (ML) and artificial intelligence (AI) algorithms, and prepare the data for high-speed consumption by business intelligence and data visualization tools. Data pipelines must also run batch, incremental, and streaming data flows. We combine historically separate data pipeline development components (for example, batch versus streaming) into a single unified system for the user.
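The unification of batch and streaming flows can be sketched in a few lines of Python. This is a minimal illustration, not eSimplicity's actual implementation; the stage names and record shape are assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class Record:
    source: str
    payload: dict


def ingest(rows: Iterable[dict], source: str) -> Iterable[Record]:
    # Ingest from any source: a batch list, an incremental pull,
    # or a streaming generator all look the same from here on.
    for row in rows:
        yield Record(source=source, payload=row)


def transform(records: Iterable[Record], fn: Callable[[dict], dict]) -> Iterable[Record]:
    # One transformation path, regardless of how the data arrived.
    for rec in records:
        yield Record(source=rec.source, payload=fn(rec.payload))


def run_pipeline(rows, source, fn):
    # Unified entry point: callers do not distinguish batch from streaming.
    return list(transform(ingest(rows, source), fn))


# A batch list and a streaming generator flow through the same code path.
batch = [{"v": 1}, {"v": 2}]
stream = ({"v": i} for i in range(3, 5))
out = run_pipeline(batch, "claims_db", lambda p: {"v": p["v"] * 10})
out += run_pipeline(stream, "events_api", lambda p: {"v": p["v"] * 10})
```

Because every stage consumes and yields the same record type, adding an incremental source means writing only a new iterator, not a new pipeline.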
When we use Palantir for our projects, pipeline development requires little to no hand-coding and uses automation to hide the underlying complexity of distributed computing engines (e.g., Spark, Hadoop). The platform provides self-service graphical visualization that lets non-experts develop data pipelines.
Data pipeline operationalization
eSimplicity standardizes deployment processes and provides automatic promotion of data pipelines from development to test to production. Our data workflows include monitoring, error handling, the ability to restart failed jobs, and fault notification.
Data pipeline orchestration
Orchestration is a complex technique that coordinates multiple data pipelines once they have been placed into operation. It defines how pipeline processes relate to one another and coordinates and manages them to avoid collisions and keep data flows from interfering with one another. eSimplicity orchestrates data pipelines that flow across on-premises and multiple cloud environments.
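One common way to express those inter-pipeline relationships is as a dependency graph, then derive a safe execution order from it. The sketch below uses Python's standard-library `graphlib`; the pipeline names are hypothetical, and real orchestrators (Airflow, Dagster, and the like) add scheduling and retries on top of the same idea.

```python
from graphlib import TopologicalSorter

# Hypothetical pipelines and their upstream dependencies. A downstream
# pipeline must not start until every pipeline it reads from has finished,
# which prevents two flows from colliding on the same data set.
dependencies = {
    "ingest_claims": set(),
    "ingest_providers": set(),
    "enrich_claims": {"ingest_claims", "ingest_providers"},
    "publish_dashboard": {"enrich_claims"},
}


def orchestrate(deps):
    """Return an execution order that respects every dependency."""
    return list(TopologicalSorter(deps).static_order())


order = orchestrate(dependencies)
```

Independent pipelines (the two ingests here) could also be dispatched in parallel; the topological order only guarantees that no pipeline starts before its inputs are ready.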
Users of data systems often need to share data sets, data pipelines, and workflows with one another. The technical core of team development is a data catalog in which all users, with the proper access rights, can easily search for and find information about pre-existing data sources, data pipelines, and workflows, and thereby derive more value from their existing investments. eSimplicity implements a custom data catalog. When we use a Palantir platform, a complete data catalog with an API and security is available out of the box.
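The essentials of such a catalog, searchable entries gated by access rights, fit in a short sketch. Everything here (entry fields, the tag-based search, the per-entry allow list) is an assumed simplification for illustration, not the custom catalog's real design.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    name: str
    kind: str          # "dataset", "pipeline", or "workflow"
    owner: str
    tags: set = field(default_factory=set)


class DataCatalog:
    """Minimal catalog: register assets, then search them, honoring access rights."""

    def __init__(self):
        self._entries = []
        self._acl = {}  # entry name -> set of users allowed to see it

    def register(self, entry, allowed_users):
        self._entries.append(entry)
        self._acl[entry.name] = set(allowed_users)

    def search(self, user, tag):
        # Return only the matching entries this user is permitted to see.
        return [e for e in self._entries
                if tag in e.tags and user in self._acl[e.name]]


catalog = DataCatalog()
catalog.register(CatalogEntry("claims_2023", "dataset", "team-a", {"claims"}),
                 allowed_users={"alice"})
catalog.register(CatalogEntry("claims_etl", "pipeline", "team-a", {"claims"}),
                 allowed_users={"alice", "bob"})
hits = catalog.search("bob", "claims")
```

Two users searching the same tag see different results, which is the point: discovery and access control live in the same place.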
Data and process governance
eSimplicity builds governance capabilities such as data lineage and access control: we control who has access to sensitive data sets, mask data appropriately, and control who can use the data pipelines built on those data sets. We also track which users change any aspect of a data pipeline, keeping an audit trail that can be reviewed. When we use a Palantir platform, this capability is available out of the box.
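Two of those controls, field masking by role and an append-only audit trail, can be sketched as follows. The field name, masking rule, and `SENSITIVE_READERS` role set are illustrative assumptions; a real deployment would back both with the platform's policy engine and durable storage.

```python
import datetime

AUDIT_TRAIL = []  # append-only log of who changed what, and when

SENSITIVE_READERS = {"privacy_officer"}  # hypothetical role allowed raw access


def record_change(user, pipeline, action):
    # Every pipeline change is recorded with a timestamp for later review.
    AUDIT_TRAIL.append({
        "user": user,
        "pipeline": pipeline,
        "action": action,
        "at": datetime.datetime.now(datetime.timezone.utc).isoformat(),
    })


def mask_ssn(value):
    # Mask all but the last four digits of a sensitive field.
    return "***-**-" + value[-4:]


def read_field(user, field_name, value):
    # Return the raw value only to authorized users; mask it otherwise.
    if field_name == "ssn" and user not in SENSITIVE_READERS:
        return mask_ssn(value)
    return value


masked = read_field("analyst", "ssn", "123-45-6789")
record_change("alice", "claims_etl", "updated masking rule")
```

The audit list is never rewritten in place, so a reviewer can replay exactly who changed which pipeline and when.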
Automation depth and context sharing
Many teams build out their data analytics pipelines by taking a best-in-class approach and automating each step in the process individually: one tool for ingesting data, a second for building transformation pipelines, a third for creating high-performance views, and another for operationalizing multiple data pipelines. They then stitch the components together themselves. eSimplicity's approach is not only to automate each step deeply but also to pass the context of what happens at each step to the next step automatically. If something changes in the data ingestion process, the data transformation process is informed and can adjust automatically, making data pipelines less brittle.
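The context-passing idea can be made concrete with a small sketch: the ingestion step emits, alongside the data, a description of the schema it actually saw, and the transformation step uses that context to adapt rather than fail. The step signatures and the `columns` context key are assumptions made for this illustration.

```python
def ingest_step(rows):
    # Ingest rows and also emit context: the set of columns observed,
    # so downstream steps can react to schema changes automatically.
    rows = list(rows)
    columns = set().union(*(r.keys() for r in rows)) if rows else set()
    return rows, {"columns": columns}


def transform_step(rows, context, wanted):
    # Select only the wanted columns that actually exist upstream.
    # If the ingestion schema changes, the transform adjusts instead
    # of crashing on a missing column.
    available = context["columns"] & set(wanted)
    return [{c: r.get(c) for c in available} for r in rows]


rows, ctx = ingest_step([{"id": 1, "name": "a"}, {"id": 2}])
out = transform_step(rows, ctx, wanted=["id", "name", "legacy_col"])
```

Here `legacy_col` has disappeared upstream, yet the transform still runs; with stitched-together point tools, that schema drift would typically surface as a runtime failure in a different system.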