API Reference#
This page contains the complete API reference for pyrudof. The library provides Python bindings for performing RDF operations including validation, schema conversion, SPARQL queries, and data generation.
Core Classes#
Rudof#
- class pyrudof.Rudof#
Main interface for working with Semantic Web operations.
Provides a unified interface for:
Loading and serializing RDF and PG data.
Loading, checking, serializing and validating ShEx schemas.
Loading, serializing and validating SHACL shapes.
Loading, serializing and validating PGSchemas.
Loading, running and serializing SPARQL queries and query results.
Converting and comparing schemas between supported formats.
Loading and serializing DCTAP and Service Descriptions.
Generating synthetic data from schemas.
- classmethod __new__(*args, **kwargs)#
- __repr__()#
Return repr(self).
- add_external_resolver(spec)#
Registers an external-shape resolver from a spec string.
The spec follows the grammar <kind>[:<arg>]. Built-in kinds (see list_external_resolvers()):
reject-all – reject any unhandled EXTERNAL shape.
schema:<path> – substitute EXTERNAL declarations using a ShEx file.
Resolvers are prepended to the chain, so the most recently added is consulted first. Call this before load_shex_schema() because the compiler reads the chain during AST→IR.
- Parameters:
spec (str) – Resolver spec string.
- Raises:
RudofError – If the spec is malformed or the resolver cannot be built.
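To make the <kind>[:<arg>] grammar concrete, here is a minimal sketch in plain Python. The helper split_resolver_spec and the list named chain are hypothetical illustrations, not part of pyrudof, which performs its own parsing; the sketch only assumes that the first colon separates the kind from its argument.

```python
from typing import Optional, Tuple


def split_resolver_spec(spec: str) -> Tuple[str, Optional[str]]:
    """Split a `<kind>[:<arg>]` spec into (kind, arg).

    Hypothetical helper for illustration only; pyrudof parses specs itself.
    """
    kind, sep, arg = spec.partition(":")
    if not kind:
        raise ValueError(f"missing resolver kind in spec: {spec!r}")
    # Only the first colon is a separator, so the argument may contain colons.
    return kind, (arg if sep else None)


# Toy model of the chain: the default holds reject-all only, and each
# newly added resolver is prepended, so it is consulted first.
chain = ["reject-all"]
chain.insert(0, "schema:shapes/external.shex")

for spec in chain:
    print(split_resolver_spec(spec))
```

With a real instance, the equivalent call would be add_external_resolver("schema:shapes/external.shex"), where the file path is a placeholder.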
- clear_external_resolvers()#
Resets the external-shape resolver chain to the default (reject-all only).
- compare_schemas(schema1, schema2, mode1, mode2, format1, format2, base1, base2, label1, label2, reader_mode)#
Compares two schemas for structural equivalence.
Converts both schemas to Common Shapes Model and performs structural comparison.
- Parameters:
schema1 (str) – First schema content.
schema2 (str) – Second schema content.
mode1 (str, optional) – First schema type. Defaults to “shex”.
mode2 (str, optional) – Second schema type. Defaults to “shex”.
format1 (str, optional) – First schema format. Defaults to “turtle”.
format2 (str, optional) – Second schema format. Defaults to “turtle”.
base1 (str, optional) – First base IRI. Defaults to None.
base2 (str, optional) – Second base IRI. Defaults to None.
label1 (str, optional) – First shape label. Defaults to None.
label2 (str, optional) – Second shape label. Defaults to None.
reader_mode (ReaderMode, optional) – Error handling. Defaults to ReaderMode.Lax.
- Returns:
Comparison result showing differences.
- Return type:
ShaCo
- Raises:
RudofError – If either schema is malformed or comparison fails.
- convert_schemas(schema, input_mode, output_mode, input_format, output_format, base=None, reader_mode=None, shape=None, templates_folder=None, output_folder=None)#
- get_version()#
Alias for version(). Returns the current Rudof version.
- Returns:
Version string in semver format (e.g., “0.1.0”).
- Return type:
str
- list_endpoints()#
Lists known SPARQL endpoints.
- static list_external_resolvers()#
Returns the built-in external-shape resolver kinds.
- materialize(format=None, node=None)#
Materializes an RDF graph from the current ShEx schema and MapState.
Uses the Map semantic-action state (loaded via read_map_state() or set after ShEx validation) to populate the triples defined by the ShEx schema’s Map extensions.
- Parameters:
format (ResultDataFormat, optional) – RDF serialization format for the output graph. Defaults to ResultDataFormat.Turtle.
node (str, optional) – IRI string used as the root subject node of the materialized graph. A fresh blank node is minted when omitted.
- Returns:
Serialized RDF graph.
- Return type:
str
- Raises:
RudofError – If no ShEx schema or MapState is loaded, if the node IRI is invalid, or if materialization or serialization fails.
- node_info(node_selector, predicates=None, mode=None, show_colors=None, depth=None)#
Retrieves detailed information about a specific node in the RDF graph.
Provides a neighborhood view of a node, including its properties, outgoing and incoming edges, and connected nodes up to a specified depth.
- Parameters:
node_selector (
str) – Node identifier. Can be: - Full IRI:<http://example.org/alice>- Prefixed name::alice- Blank node:_:b1predicates (
List[str], optional) – Filter by specific predicates. Empty list means all predicates. Defaults to[].mode (
str, optional) – Node inspection mode. Can be: -"outgoing": Show only outgoing edges -"incoming": Show only incoming edges -"both": Show both outgoing and incoming edges Defaults to"both".show_colors (
bool, optional) – Use ANSI terminal colors in output. Defaults toTrue.depth (
int, optional) – Neighborhood distance (1=direct neighbors, 2=neighbors of neighbors, etc.). Defaults to1.
- Returns:
Formatted string with node information and neighborhood graph.
- Return type:
str
- Raises:
RudofError – If the node selector is invalid or the node does not exist in the graph.
Note
Colors require a terminal with ANSI escape sequence support.
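A minimal usage sketch for node_info() follows. It assumes rudof is an already constructed pyrudof.Rudof instance (for example one created from a RudofConfig); the helper name describe_outgoing and the file name in the usage comment are hypothetical. Colors are switched off so the returned text can be written to a log or file without ANSI escape sequences.

```python
def describe_outgoing(rudof, data_source, node, depth=2):
    """Load RDF data, then return the outgoing neighborhood of `node` as text.

    `rudof` is assumed to be a pyrudof.Rudof instance; only the documented
    read_data() and node_info() methods are called.
    """
    rudof.read_data(data_source)
    return rudof.node_info(
        node,
        mode="outgoing",     # show only edges leaving the node
        show_colors=False,   # plain text, no ANSI escape sequences
        depth=depth,         # 2 = neighbors of neighbors
    )


# Usage (placeholder file name and node):
# print(describe_outgoing(rudof, "data.ttl", ":alice"))
```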
- read_data(input=None, format=None, base=None, reader_mode=None, merge=None, endpoint=None)#
Loads RDF data from a string, file path or URL. If a SPARQL endpoint is specified, it loads data from the endpoint instead.
- Parameters:
input (
str) – String, file path or URL to the RDF data. Defaults toNone. Examples:"data.ttl","http://example.org/data.rdf"format (
RDFFormat, optional) – Serialization format. Defaults toRDFFormat.Turtle. Available: Turtle, NTriples, RdfXml, TriG, N3, NQuads, JsonLdbase (
str, optional) – Base IRI for resolving relative IRIs. Defaults toNone.reader_mode (
&ReaderMode, optional) – Error handling strategy. Defaults toReaderMode.Lax. -Lax: Continue on errors (recommended for real-world data) -Strict: Fail on first errormerge – If
True, merge with existing data; ifFalse, replace current data. Defaults toFalse.
- Raises:
RudofError – If the string, file or URL cannot be read, or if the data is malformed (in Strict mode).
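The merge parameter decides whether a call replaces or extends the current graph. The sketch below, which assumes rudof is a pyrudof.Rudof instance and uses a hypothetical helper name load_many, loads the first source with merge=False and every later source with merge=True so that all sources end up in one graph.

```python
def load_many(rudof, sources):
    """Load several RDF sources into a single graph.

    The first source replaces whatever data was loaded before
    (merge=False); each later source is merged into it (merge=True).
    """
    for index, source in enumerate(sources):
        rudof.read_data(source, merge=index > 0)


# Usage (placeholder file names):
# load_many(rudof, ["people.ttl", "addresses.ttl"])
```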
- read_dctap(input, format=None)#
Loads a DCTAP profile from a string, file path or URL.
- Parameters:
input (str) – String, file path or URL to the DCTAP profile.
format (DCTapFormat, optional) – Data format. Defaults to DCTapFormat.Csv.
- Raises:
RudofError – If DCTAP data is malformed.
- read_map_state(path)#
Loads a MapState from a JSON file.
The MapState records the bindings produced by ShEx validation with Map semantic actions. It is required before calling materialize().
- Parameters:
path (str) – Path to the JSON file containing the serialized MapState.
- Raises:
RudofError – If the file cannot be read or the JSON is malformed.
- read_query(input, query_type=None)#
Loads a SPARQL query from a string, file path or URL.
- Parameters:
- Raises:
RudofError – If file/URL cannot be read or query is malformed.
- read_service_description(input, format=None, base=None, reader_mode=None)#
Loads a Service Description from a string, file path or URL.
- Parameters:
input (str) – File path or URL to Service Description (RDF format).
format (RDFFormat, optional) – RDF format. Defaults to RDFFormat.Turtle.
base (str, optional) – Base IRI. Defaults to None.
reader_mode (ReaderMode, optional) – Error handling. Defaults to ReaderMode.Lax.
- Raises:
RudofError – If file/URL cannot be read or data is malformed.
- read_shacl(input=None, format=None, base=None, reader_mode=None)#
Loads a SHACL shapes graph from a string, file path or URL.
- Parameters:
input (str, optional) – String, file path or URL to the SHACL shapes. If not provided, the shapes are extracted from the currently loaded data.
format (ShaclFormat, optional) – RDF format. Defaults to ShaclFormat.Turtle.
base (str, optional) – Base IRI. Defaults to None.
reader_mode (ReaderMode, optional) – Error handling. Defaults to ReaderMode.Lax.
- Raises:
RudofError – If file/URL cannot be read or shapes are malformed.
- read_shapemap(input, format=None, base_nodes=None, base_shapes=None)#
Loads a ShapeMap from a string, file path or URL.
- Parameters:
input (
str) – String, file path or URL to the ShapeMap.format (
ShapeMapFormat, optional) – Format. Defaults toShapeMapFormat.Compact.base_nodes (
str, optional) – Base IRI for resolving node IRIs. Defaults toNone.base_shapes (
str, optional) – Base IRI for resolving shape IRIs. Defaults toNone.
- Raises:
RudofError – If file/URL cannot be read or ShapeMap is malformed.
- read_shex(input, format=None, base=None, reader_mode=None)#
Loads a ShEx schema from a string, file path or URL.
- Parameters:
input (str) – String, file path or URL to the ShEx schema.
format (ShExFormat, optional) – Schema format. Defaults to ShExFormat.ShExC.
base (str, optional) – Base IRI for resolving relative IRIs. Defaults to None.
reader_mode (ReaderMode, optional) – Error handling mode. Defaults to ReaderMode.Lax.
- Raises:
RudofError – If file/URL cannot be read or schema is malformed.
- reset_all()#
Resets all current state (data, schemas, queries, validation results).
This is equivalent to calling all individual reset methods. Use this to completely clean the Rudof instance.
- reset_data()#
Clears the current RDF data graph.
Removes all RDF triples from memory. Does not affect loaded schemas or other state.
- reset_query()#
Clears the current SPARQL query.
Removes the stored query from memory.
- reset_shacl()#
Clears the current SHACL shapes graph.
Unloads the SHACL schema from memory. Does not affect RDF data or other state.
- reset_shapemap()#
Clears the current ShapeMap.
Removes the ShapeMap used for ShEx validation.
- reset_shex()#
Clears the current ShEx schema.
Unloads the ShEx schema from memory. Does not affect RDF data or other state.
- reset_validation_results()#
Clears the current ShEx validation results.
- run_query()#
Executes the loaded query against the loaded data.
- Raises:
RudofError – If query is malformed or execution fails.
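Because run_query() takes no arguments, the data and the query must both be loaded first. A minimal sketch of that sequence, assuming rudof is a pyrudof.Rudof instance and using a hypothetical helper name query_to_text:

```python
def query_to_text(rudof, data_source, query_source):
    """Run a SPARQL query over RDF data and return the serialized results.

    Order matters: run_query() works on whatever data and query are
    currently loaded in the Rudof instance.
    """
    rudof.read_data(data_source)    # 1. load the RDF graph
    rudof.read_query(query_source)  # 2. load the SPARQL query
    rudof.run_query()               # 3. execute it
    return rudof.serialize_query_results()  # 4. results in the default format


# Usage (placeholder file names):
# print(query_to_text(rudof, "data.ttl", "people.rq"))
```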
- serialize_current_shex(shape_label=None, show_dependencies=None, show_statistics=None, show_schema=None, show_time=None, format=None)#
Serializes the current ShEx schema to a string.
- Parameters:
format (ResultShExValidationFormat, optional) – Output format. Defaults to ResultShExValidationFormat.Details.
- Returns:
Serialized ShEx schema.
- Return type:
str
- Raises:
RudofError – If no schema is loaded or serialization fails.
- serialize_data(format=None)#
Serializes the current RDF data to a string.
- Parameters:
format (ResultDataFormat, optional) – Output format. Defaults to ResultDataFormat.Compact.
- Returns:
Serialized RDF data.
- Return type:
str
- Raises:
RudofError – If serialization fails.
- serialize_dctap(format=None)#
Serializes the current DCTAP profile to a string.
- Parameters:
format (ResultDCTapFormat, optional) – Output format. Defaults to ResultDCTapFormat.Internal.
- Returns:
Serialized DCTAP profile.
- Return type:
str
- Raises:
RudofError – If no DCTAP profile is loaded or serialization fails.
- serialize_query_results(format=None)#
Serializes the results of the last executed query to a string.
- Parameters:
format (QueryResultFormat, optional) – Output format. Defaults to QueryResultFormat.Compact.
- Returns:
Serialized query results.
- Return type:
str
- Raises:
RudofError – If serialization fails or if the resulting bytes cannot be converted into a valid UTF-8 string.
- serialize_service_description(format=None)#
Writes the current Service Description to a file.
- Parameters:
format (ServiceDescriptionFormat, optional) – Format. Defaults to ServiceDescriptionFormat.Internal.
- Raises:
RudofError – If no description is loaded or file cannot be written.
- serialize_shacl(format=None)#
Serializes the current SHACL shapes graph to a string.
- Parameters:
format (ShaclFormat, optional) – Output format. Defaults to ShaclFormat.Turtle.
- Returns:
Serialized SHACL shapes.
- Return type:
str
- Raises:
RudofError – If no shapes are loaded or serialization fails.
- serialize_shacl_validation_results(format=None, sort_mode=None)#
Serializes the results of the last SHACL validation operation to a string.
- Parameters:
format (ResultShaclValidationFormat, optional) – Output format. Defaults to ResultShaclValidationFormat.Details.
sort_mode (ShaclValidationSortMode, optional) – Sorting mode for validation results. Defaults to ShaclValidationSortMode.Severity.
- Returns:
Serialized validation results.
- Return type:
str
- serialize_shapemap(format=None)#
Serializes the current ShapeMap to a string.
- Parameters:
format (ShapeMapFormat, optional) – Output format. Defaults to ShapeMapFormat.Compact.
- Returns:
Serialized ShapeMap.
- Return type:
str
- Raises:
RudofError – If serialization fails or if the resulting bytes cannot be converted into a valid UTF-8 string.
- serialize_shex_validation_results(format=None, sort_mode=None)#
Serializes the results of the last ShEx validation operation to a string.
- Parameters:
format (ResultShExValidationFormat, optional) – Output format. Defaults to ResultShExValidationFormat.Details.
sort_mode (PyShexValidationSortMode, optional) – Sorting mode for validation results. Defaults to PyShexValidationSortMode.Node.
- Returns:
Serialized validation results.
- Return type:
str
- update_config(config)#
Updates the configuration of this Rudof instance.
- Parameters:
config (RudofConfig) – New configuration to apply.
Note
This does not affect already-loaded data or schemas, only future operations.
- validate_shacl(mode=None)#
Validates the current RDF data against the loaded SHACL shapes.
Performs comprehensive SHACL validation checking all constraints defined in the shapes graph.
- Parameters:
mode (ShaclValidationMode, optional) – Validation engine. Defaults to ShaclValidationMode.Native.
- Native: Fast built-in engine (recommended)
- Sparql: SPARQL-based engine (slower, for debugging)
- Returns:
Detailed validation report with conformance status and violations.
- Return type:
ValidationReport
- Raises:
RudofError – If no data or schema is loaded, or validation fails.
Note
Native mode is recommended for production because it is faster.
SPARQL mode is useful for debugging complex constraints.
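A minimal sketch of a SHACL validation run, assuming rudof is a pyrudof.Rudof instance. The helper name shacl_report_text is hypothetical; the mode argument is left out so the documented default (the native engine) applies.

```python
def shacl_report_text(rudof, data_source, shapes_source):
    """Validate RDF data against SHACL shapes.

    Returns the ValidationReport object together with its text rendering
    in the default output format.
    """
    rudof.read_data(data_source)
    rudof.read_shacl(shapes_source)
    report = rudof.validate_shacl()  # default mode: native engine
    text = rudof.serialize_shacl_validation_results()
    return report, text


# Usage (placeholder file names):
# report, text = shacl_report_text(rudof, "data.ttl", "shapes.ttl")
# print(text)
```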
- validate_shex()#
Validates the current RDF data against the loaded ShEx schema using the current ShapeMap.
Performs ShEx validation by checking if nodes conform to their associated shapes as defined in the ShapeMap.
- Raises:
RudofError – If no schema, data, or ShapeMap is loaded.
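ShEx validation needs three inputs loaded beforehand: data, schema and ShapeMap. A minimal sketch of the full sequence, assuming rudof is a pyrudof.Rudof instance and using a hypothetical helper name shex_results_text:

```python
def shex_results_text(rudof, data_source, schema_source, shapemap_source):
    """Validate RDF data against a ShEx schema and return the results as text.

    validate_shex() raises RudofError when the schema, the data or the
    ShapeMap is missing, so all three are loaded first.
    """
    rudof.read_data(data_source)
    rudof.read_shex(schema_source)
    rudof.read_shapemap(shapemap_source)
    rudof.validate_shex()
    return rudof.serialize_shex_validation_results()


# Usage (placeholder file names):
# print(shex_results_text(rudof, "data.ttl", "schema.shex", "map.sm"))
```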
RudofConfig#
- class pyrudof.RudofConfig#
Contains the configuration parameters for Rudof.
It can be:
Created with default values.
Loaded from a configuration file.
Used to create a new Rudof instance.
Used to update the configuration of an existing Rudof instance.
- classmethod __new__(*args, **kwargs)#
- __repr__()#
Return repr(self).
- static from_path(path)#
Loads a RudofConfig from a file path.
- Parameters:
path (str) – Path to the configuration file.
- Returns:
A configuration object initialized from the file.
- Return type:
RudofConfig
- Raises:
RudofError – If the file cannot be read or parsed.
RudofError#
Data Formats#
RDF Formats#
- class pyrudof.RDFFormat#
RDF data serialization formats supported when reading or writing graphs.
Supported RDF serialization formats:
RDFFormat.Turtle - Terse RDF Triple Language (.ttl)
RDFFormat.NTriples - Line-based RDF format (.nt)
RDFFormat.RdfXml - XML-based RDF syntax (.rdf, .owl)
RDFFormat.TriG - Turtle with named graphs (.trig)
RDFFormat.N3 - Notation3 (.n3)
RDFFormat.NQuads - N-Triples with named graphs (.nq)
RDFFormat.JsonLd - JSON-LD format (.jsonld)
- JsonLd = RDFFormat.JsonLd#
- N3 = RDFFormat.N3#
- NQuads = RDFFormat.NQuads#
- NTriples = RDFFormat.NTriples#
- RdfXml = RDFFormat.RdfXml#
- TriG = RDFFormat.TriG#
- Turtle = RDFFormat.Turtle#
- classmethod __new__(*args, **kwargs)#
- class pyrudof.ResultDataFormat#
Output formats for serialized RDF data:
ResultDataFormat.Turtle - Turtle
ResultDataFormat.NTriples - N-Triples
ResultDataFormat.RdfXml - RDF/XML
ResultDataFormat.TriG - TriG
ResultDataFormat.N3 - Notation3
ResultDataFormat.NQuads - N-Quads
ResultDataFormat.JsonLd - JSON-LD
ResultDataFormat.Compact - Compact representation (default)
ResultDataFormat.Json - JSON
ResultDataFormat.PlantUML - PlantUML diagram
ResultDataFormat.Svg - SVG image
ResultDataFormat.Png - PNG image
- Compact = ResultDataFormat.Compact#
- Json = ResultDataFormat.Json#
- JsonLd = ResultDataFormat.JsonLd#
- N3 = ResultDataFormat.N3#
- NQuads = ResultDataFormat.NQuads#
- NTriples = ResultDataFormat.NTriples#
- PlantUML = ResultDataFormat.PlantUML#
- Png = ResultDataFormat.Png#
- RdfXml = ResultDataFormat.RdfXml#
- Svg = ResultDataFormat.Svg#
- TriG = ResultDataFormat.TriG#
- Turtle = ResultDataFormat.Turtle#
- classmethod __new__(*args, **kwargs)#
ShEx Formats#
- class pyrudof.ShExFormat#
ShEx schema serialization formats.
Supported ShEx schema formats:
ShExFormat.ShExC - ShEx Compact Syntax (human-readable, .shex)
ShExFormat.ShExJ - ShEx JSON format (.json)
ShExFormat.Turtle - ShEx in RDF/Turtle (.ttl)
- ShExC = ShExFormat.ShExC#
- ShExJ = ShExFormat.ShExJ#
- Turtle = ShExFormat.Turtle#
- classmethod __new__(*args, **kwargs)#
- class pyrudof.ResultShexValidationFormat#
Output formats for ShEx validation results:
ResultShexValidationFormat.Details - Human-readable details (default)
ResultShexValidationFormat.Turtle - Turtle
ResultShexValidationFormat.NTriples - N-Triples
ResultShexValidationFormat.RdfXml - RDF/XML
ResultShexValidationFormat.TriG - TriG
ResultShexValidationFormat.N3 - Notation3
ResultShexValidationFormat.NQuads - N-Quads
ResultShexValidationFormat.Compact - Compact
ResultShexValidationFormat.Json - JSON
ResultShexValidationFormat.Csv - CSV
- Compact = ResultShexValidationFormat.Compact#
- Csv = ResultShexValidationFormat.Csv#
- Details = ResultShexValidationFormat.Details#
- Json = ResultShexValidationFormat.Json#
- N3 = ResultShexValidationFormat.N3#
- NQuads = ResultShexValidationFormat.NQuads#
- NTriples = ResultShexValidationFormat.NTriples#
- RdfXml = ResultShexValidationFormat.RdfXml#
- TriG = ResultShexValidationFormat.TriG#
- Turtle = ResultShexValidationFormat.Turtle#
- classmethod __new__(*args, **kwargs)#
SHACL Formats#
- class pyrudof.ShaclFormat#
SHACL shapes graph serialization formats.
SHACL shapes graph serialization formats (all RDF-based):
ShaclFormat.Turtle - Turtle format (.ttl)
ShaclFormat.NTriples - N-Triples format (.nt)
ShaclFormat.RdfXml - RDF/XML format (.rdf)
ShaclFormat.TriG - TriG format (.trig)
ShaclFormat.N3 - Notation3 format (.n3)
ShaclFormat.NQuads - N-Quads format (.nq)
- N3 = ShaclFormat.N3#
- NQuads = ShaclFormat.NQuads#
- NTriples = ShaclFormat.NTriples#
- RdfXml = ShaclFormat.RdfXml#
- TriG = ShaclFormat.TriG#
- Turtle = ShaclFormat.Turtle#
- classmethod __new__(*args, **kwargs)#
ShapeMap Formats#
- class pyrudof.ShapeMapFormat#
ShapeMap serialization formats.
ShapeMap serialization formats:
ShapeMapFormat.Compact - Compact ShapeMap syntax (human-readable)
ShapeMapFormat.Json - JSON representation
- Compact = ShapeMapFormat.Compact#
- Json = ShapeMapFormat.Json#
- classmethod __new__(*args, **kwargs)#
Other Formats#
- class pyrudof.DCTapFormat#
DCTAP input formats.
DCTAP (Dublin Core Tabular Application Profiles) formats:
DCTapFormat.Csv - Comma-separated values (.csv)
DCTapFormat.Xlsx - Excel spreadsheet (.xlsx)
- Csv = DCTapFormat.Csv#
- Xlsx = DCTapFormat.Xlsx#
- classmethod __new__(*args, **kwargs)#
- class pyrudof.QueryResultFormat#
Output formats for SPARQL CONSTRUCT query results.
SPARQL query result formats:
QueryResultFormat.Turtle - Turtle format (.ttl)
QueryResultFormat.NTriples - N-Triples format (.nt)
QueryResultFormat.RdfXml - RDF/XML format (.rdf)
QueryResultFormat.TriG - TriG format (.trig)
QueryResultFormat.N3 - Notation3 format (.n3)
QueryResultFormat.NQuads - N-Quads format (.nq)
QueryResultFormat.Csv - CSV table format (.csv)
- Csv = QueryResultFormat.Csv#
- N3 = QueryResultFormat.N3#
- NQuads = QueryResultFormat.NQuads#
- NTriples = QueryResultFormat.NTriples#
- RdfXml = QueryResultFormat.RdfXml#
- TriG = QueryResultFormat.TriG#
- Turtle = QueryResultFormat.Turtle#
- classmethod __new__(*args, **kwargs)#
- class pyrudof.QueryType#
SPARQL query type:
QueryType.Select - SELECT query
QueryType.Construct - CONSTRUCT query
QueryType.Ask - ASK query
QueryType.Describe - DESCRIBE query
- Ask = QueryType.Ask#
- Construct = QueryType.Construct#
- Describe = QueryType.Describe#
- Select = QueryType.Select#
- classmethod __new__(*args, **kwargs)#
- class pyrudof.ServiceDescriptionFormat#
Service Description serialization format.
SPARQL Service Description formats:
ServiceDescriptionFormat.Internal - Internal representation
ServiceDescriptionFormat.Json - JSON format
ServiceDescriptionFormat.Mie - MIE specification format
- Internal = ServiceDescriptionFormat.Internal#
- Json = ServiceDescriptionFormat.Json#
- Mie = ServiceDescriptionFormat.Mie#
- classmethod __new__(*args, **kwargs)#
Reader Configuration#
- class pyrudof.ReaderMode#
Declares the reader mode used when parsing RDF data.
The reader mode controls how strictly parsers react to syntax errors and other issues in the input stream (files, URLs, strings).
Controls error handling during parsing:
ReaderMode.Lax - Ignore non-fatal errors and continue (default, recommended for real-world data)
ReaderMode.Strict - Fail immediately on first error (useful for strict validation)
- Lax = ReaderMode.Lax#
- Strict = ReaderMode.Strict#
- classmethod __new__(*args, **kwargs)#
Validation#
SHACL Validation#
- class pyrudof.ShaclValidationMode#
SHACL validation engine.
SHACL validation engines:
ShaclValidationMode.Native - Native SHACL validation engine (faster, recommended)
ShaclValidationMode.Sparql - SPARQL-based validation (slower, useful for debugging)
- Native = ShaclValidationMode.Native#
- Sparql = ShaclValidationMode.Sparql#
- classmethod __new__(*args, **kwargs)#
- class pyrudof.ShapesGraphSource#
Source of the SHACL shapes graph used during validation.
Shapes can come from the current SHACL schema or be extracted from the current RDF data graph.
Source of SHACL shapes for validation:
ShapesGraphSource.CurrentData - Extract shapes from the current RDF data graph
ShapesGraphSource.CurrentSchema - Use the currently loaded SHACL schema
- CurrentData = ShapesGraphSource.CurrentData#
- CurrentSchema = ShapesGraphSource.CurrentSchema#
- classmethod __new__(*args, **kwargs)#
ShEx Validation#
Materialize#
The materialize operation generates an RDF graph by combining a ShEx schema
(which describes the graph structure via Map semantic actions) with a MapState
that supplies the concrete node values.
Workflow:
Load a ShEx schema with Map semantic actions using Rudof.read_shex().
Load the MapState (produced by running ShEx validation with Map extensions, or built manually as a JSON file) using Rudof.read_map_state().
Call Rudof.materialize() to produce the serialized RDF graph.
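The three workflow steps above can be sketched as one function. This assumes rudof is a pyrudof.Rudof instance; the helper name materialize_graph and the file names in the usage comment are hypothetical.

```python
def materialize_graph(rudof, schema_source, map_state_path, root_node=None):
    """Materialize an RDF graph from a ShEx schema and a MapState file.

    When `root_node` is None, materialize() mints a fresh blank node as
    the root subject; otherwise the given IRI string is used.
    """
    rudof.read_shex(schema_source)        # step 1: schema with Map actions
    rudof.read_map_state(map_state_path)  # step 2: bindings from JSON
    return rudof.materialize(node=root_node)  # step 3: serialized graph


# Usage (placeholder file names and IRI):
# turtle = materialize_graph(rudof, "schema.shex", "map_state.json",
#                            root_node="http://example.org/alice")
```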
MapState JSON format:
The MapState file is a JSON object that maps each Map-extension IRI key
(the code value in a SemAct of type http://shex.io/extensions/Map/)
to an RDF node value. IRI nodes use {"Iri": "<iri-string>"}:
{
"http://example.org/name": {"Iri": "http://example.org/Alice"},
"http://example.org/email": {"Iri": "mailto:alice@example.org"}
}
See the Examples page for full working examples.
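For MapState files built by hand, the JSON shape described above can be generated with the standard json module. This is a minimal sketch: the helper write_map_state is hypothetical, it covers only IRI-valued bindings (the one node form documented here), and the keys and values repeat the sample above.

```python
import json
import os
import tempfile


def write_map_state(bindings, path):
    """Write a MapState JSON file from {map_key_iri: value_iri} pairs.

    Each value is wrapped as {"Iri": ...}, the form shown for IRI nodes.
    """
    state = {key: {"Iri": iri} for key, iri in bindings.items()}
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(state, handle, indent=2)
    return state


# Write the sample bindings to a temporary file.
map_state_path = os.path.join(tempfile.mkdtemp(), "map_state.json")
state = write_map_state(
    {
        "http://example.org/name": "http://example.org/Alice",
        "http://example.org/email": "mailto:alice@example.org",
    },
    map_state_path,
)
print(json.dumps(state, indent=2))
```

The resulting path can then be passed to Rudof.read_map_state() before calling Rudof.materialize().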
Data Generation#
For the complete data generation API reference (GeneratorConfig,
DataGenerator, SchemaFormat, OutputFormat, CardinalityStrategy,
EntityDistribution, DataQuality), see Data Generation.