Jens Doerpmund
Hitachi Vantara
Vice President, Software Architecture
Biography
Jens Doerpmund is responsible for architectural governance and the strategic technical direction of Lumada software products at Hitachi Vantara, which include advanced data and metadata management solutions as well as storage solutions across disparate data sources and environments (on-premises and cloud), with a focus on distributed query processing and ML/AI workloads. Before joining Hitachi Vantara, Jens was Chief Architect at SAP Labs, where he focused on topics such as the integration of machine learning and big data solutions with SAP’s enterprise software. Other roles during his 11 years at SAP include Chief Architect in the SAP NetWeaver / Business Warehouse Implementation Group, Solution Management for the BusinessObjects organization, and Product Manager for the development of a knowledge graph-based analytics solution. Jens has three patents in the area of graph analytics and has delivered numerous ML training sessions to SAP employees all over the world. Prior to SAP, Jens spent 10 years at Hewlett-Packard Consulting, where he led a data warehousing consulting team. Jens studied computer science in Germany and the UK. He holds a Master’s degree in Computer Science with a focus on Artificial Intelligence from the University of Manchester, UK, where he also worked as a postgraduate researcher and AI instructor.
Talks and Events
Actionable, Facet-based, Property Graphs
In general, property graphs are very flexible since we can associate any number of properties with nodes and edges. To add more structure, nodes and/or edges are often typed (via a label). In that case, a labeled node (or edge) of a particular type is expected to have specific properties. This works fine if node types are well defined and remain relatively stable. But what if we want to define relationships between any kind of nodes (existing or future node types)? For instance, in a metadata graph, we may be interested in the data lineage between various node types (“entities”), but in reality it doesn’t matter whether the node type is a dataset, the input to (or output of) a machine learning model, a physical device or digital twin that provides real-time data, etc. To model data lineage, all nodes need to include a group of properties that we would refer to as a database schema, but the actual type of those nodes is irrelevant. In general, how nodes can be related to other nodes, or how any service can observe or interact with nodes in a graph, depends merely on the shared groups of properties, which are often referred to as aspects or facets.
In our presentation, we provide numerous examples of the benefits of facet-based graph models. In particular, we will focus on actionable property graphs that can be utilized for self-governing data management and various kinds of optimization via a pattern that is very popular in game programming, namely Entity Component Systems (ECS). Instead of defining nodes of a particular type, nodes are merely modeled as UIDs and sets of facets (aspects, components) that are standardized and can be added dynamically. For a metadata graph, this could include the logical model (via schema and ontology facets), physical aspects (facets for data formats and locations), statistics and usage, and governance (e.g., facets that state details about the inclusion of personally identifiable information).
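The idea of nodes as bare UIDs plus dynamically attached facets can be sketched roughly as follows. This is only an illustrative sketch, not the speaker's implementation: the class `FacetGraph`, the facet types `SchemaFacet`, `LocationFacet`, and `PiiFacet`, and all of their fields are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Dict, List
from uuid import uuid4


# Hypothetical facet types; names and fields are assumptions for illustration.
@dataclass
class SchemaFacet:
    columns: Dict[str, str]  # column name -> data type


@dataclass
class LocationFacet:
    uri: str
    data_format: str


@dataclass
class PiiFacet:
    columns: List[str]  # columns that hold personally identifiable information


class FacetGraph:
    """Nodes are bare UIDs; all structure comes from the facets attached to them."""

    def __init__(self):
        self._facets = {}  # uid -> {facet type -> facet instance}

    def add_node(self, *facets):
        uid = str(uuid4())
        self._facets[uid] = {}
        for facet in facets:
            self.attach(uid, facet)
        return uid

    def attach(self, uid, facet):
        # Facets can be added at any time; the node has no fixed type.
        self._facets[uid][type(facet)] = facet

    def get(self, uid, facet_type):
        return self._facets[uid].get(facet_type)

    def query(self, *facet_types):
        # Select nodes by the facets they carry, not by a node label.
        return [
            uid
            for uid, facets in self._facets.items()
            if all(t in facets for t in facet_types)
        ]


graph = FacetGraph()
dataset = graph.add_node(
    SchemaFacet({"id": "int", "email": "string"}),
    LocationFacet("s3://example-bucket/customers", "parquet"),
)
sensor = graph.add_node(SchemaFacet({"ts": "timestamp", "temp": "float"}))

# A governance facet is attached later, without changing any node type.
graph.attach(dataset, PiiFacet(["email"]))

nodes_with_schema = graph.query(SchemaFacet)
nodes_with_pii = graph.query(SchemaFacet, PiiFacet)
```

Both the dataset node and the sensor node are returned by the schema query even though they would carry different labels in a typed graph, which is the property that lets lineage be defined across arbitrary node types.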
In order to make a graph actionable, external processes (so-called systems) operate on arbitrary nodes that happen to include certain facets. One system would operate on nodes that contain a schema facet, ensure that data lineage is maintained, and provide an impact analysis if changes are necessary. Another system continually monitors access restrictions for nodes that represent datasets and contain a facet that specifies personally identifiable information.
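As a rough illustration of such systems, the sketch below runs two functions over a small in-memory graph. Everything here is an assumption made for the example: the dictionary layout, the facet names `schema`, `pii`, and `access`, and the functions `impact_analysis` and `pii_violations` are hypothetical and do not come from the talk.

```python
# Hypothetical in-memory graph: uid -> {facet name -> facet data}.
nodes = {
    "n1": {
        "schema": {"columns": ["id", "email"]},
        "pii": {"columns": ["email"]},
        "access": {"roles": ["analyst", "public"]},
    },
    "n2": {"schema": {"columns": ["id", "score"]}},
    "n3": {"location": {"uri": "s3://example-bucket/raw"}},
}

# Lineage edges as (upstream uid, downstream uid) pairs.
lineage = [("n1", "n2")]


def with_facets(nodes, *required):
    """Select nodes purely by the facets they carry."""
    return [
        uid for uid, facets in nodes.items() if all(name in facets for name in required)
    ]


def impact_analysis(nodes, lineage, changed_uid):
    """Lineage system: downstream nodes with a schema facet affected by a change."""
    eligible = set(with_facets(nodes, "schema"))
    affected, frontier = set(), [changed_uid]
    while frontier:
        current = frontier.pop()
        for upstream, downstream in lineage:
            if upstream == current and downstream in eligible and downstream not in affected:
                affected.add(downstream)
                frontier.append(downstream)
    return affected


def pii_violations(nodes, allowed_roles):
    """PII system: nodes whose access facet grants roles outside the allowed set."""
    violations = {}
    for uid in with_facets(nodes, "pii", "access"):
        extra = set(nodes[uid]["access"]["roles"]) - set(allowed_roles)
        if extra:
            violations[uid] = sorted(extra)
    return violations


affected = impact_analysis(nodes, lineage, "n1")
violations = pii_violations(nodes, {"analyst"})
```

Neither function checks a node type. Each one selects its inputs by facet, so a node of a kind that does not exist yet would be picked up as soon as it carries the relevant facets.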
Other systems automate the placement of datasets and operate on nodes that include multiple facets (for data location, usage statistics, and PII). With this information, the system can find the optimal location of a dataset while taking usage and legal restrictions into consideration.
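A placement system of this kind might look roughly like the sketch below. It is a deliberately simple stand-in for a real optimizer: the function `place_dataset`, the facet fields `reads_by_region` and `allowed_regions`, the rule "pick the permitted region with the most reads", and the choice to treat a missing PII facet as "no legal restriction" are all assumptions made for illustration.

```python
def place_dataset(facets, candidate_regions):
    """Pick a region for a dataset from its location, usage, and PII facets.

    Returns None when the node lacks the required facets or no region is permitted.
    """
    # The system only acts on nodes that carry the facets it needs.
    if "location" not in facets or "usage" not in facets:
        return None

    allowed = set(candidate_regions)
    pii = facets.get("pii")
    if pii is not None:
        # Assumed legal restriction: PII data may only live in listed regions.
        allowed &= set(pii["allowed_regions"])
    if not allowed:
        return None

    reads = facets["usage"]["reads_by_region"]
    # Sorting first makes ties resolve deterministically (alphabetical order).
    return max(sorted(allowed), key=lambda region: reads.get(region, 0))


candidates = ["us-east", "eu-west", "eu-central"]

customer_data = {
    "location": {"region": "us-east"},
    "usage": {"reads_by_region": {"us-east": 900, "eu-west": 300}},
    "pii": {"allowed_regions": ["eu-west", "eu-central"]},
}
public_data = {
    "location": {"region": "eu-west"},
    "usage": {"reads_by_region": {"us-east": 900, "eu-west": 300}},
}

customer_region = place_dataset(customer_data, candidates)
public_region = place_dataset(public_data, candidates)
```

In this example the PII-bearing dataset is kept in `eu-west` even though most reads come from `us-east`, while the unrestricted dataset follows its usage to `us-east`.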
While the concept of facets or aspects is not new, the purpose of the presentation is to raise awareness of the benefits of facets. In particular, we show how facets can help turn property graphs into “active” property graphs.
Track: Metadata
Session Topics:
- Active Metadata
- Entity Component Systems
- Data Modeling
- Ontologies
- Semantic Layer