Class ParquetDataSource

java.lang.Object
com.isomorphic.base.Base
All Implemented Interfaces:
com.isomorphic.base.IAutoConfigurable, com.isomorphic.datasource.Committable, com.isomorphic.datasource.FreeResourcesHandler, com.isomorphic.datasource.IType, IToJSON, Serializable

public class ParquetDataSource extends BasicDataSource

A Parquet-backed DataSource implementation (serverType="parquet") that can:

  1. Serve Parquet data via SmartClient DSRequests (currently executeFetch(DSRequest) delegates to executeParquetFetch(DSRequest)).
  2. Generate SQL import artifacts from a Parquet file, including a SQL .ds.xml and (optionally) a SmartClient test-data file (.data.xml or JSON).

Parquet source location (dataURL)

This DataSource reads Parquet data from the dataURL property in the DataSource configuration. Supported forms:

  • webRoot-relative path (no scheme): examples/shared/data/myfile.parquet
  • file:// (treated as webRoot-relative per SmartClient example conventions): file:/examples/shared/data/myfile.parquet or file://examples/shared/data/myfile.parquet
  • https:// URL: https://example.com/myfile.parquet
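The three forms above can be told apart with a simple prefix check. The following is a minimal sketch, not the class's actual parsing logic: `DataUrlForms`, `classify` and `toWebRootRelative` are hypothetical names, and rejecting other schemes is an assumption made for illustration.

```java
import java.util.Locale;

/** Hypothetical helper: classifies a dataURL into the three documented forms. */
final class DataUrlForms {
    enum Form { WEBROOT_RELATIVE, FILE, HTTPS }

    static Form classify(String dataURL) {
        if (dataURL == null || dataURL.trim().isEmpty()) {
            throw new IllegalArgumentException("dataURL is required");
        }
        String lower = dataURL.trim().toLowerCase(Locale.ROOT);
        if (lower.startsWith("https://")) return Form.HTTPS;
        if (lower.startsWith("file:")) return Form.FILE;
        // Assumption: any other scheme is rejected rather than guessed at.
        if (lower.contains("://")) {
            throw new IllegalArgumentException("Unsupported scheme: " + dataURL);
        }
        return Form.WEBROOT_RELATIVE;
    }

    /** Strips a "file:" prefix and leading slashes, leaving a webRoot-relative path. */
    static String toWebRootRelative(String dataURL) {
        String s = dataURL.trim();
        if (s.regionMatches(true, 0, "file:", 0, 5)) {
            s = s.substring(5);
        }
        return s.replaceFirst("^/+", "");
    }

    public static void main(String[] args) {
        System.out.println(classify("examples/shared/data/myfile.parquet"));
        System.out.println(toWebRootRelative("file://examples/shared/data/myfile.parquet"));
    }
}
```

Under this sketch both `file:/...` and `file://...` spellings reduce to the same webRoot-relative path, which matches the convention described above.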

Schema auto-derivation (autoDeriveSchema:true)

When autoDeriveSchema:true is set and no explicit fields are defined (or appendMissingColumns behavior is enabled), this class derives field definitions from the schema of the Parquet file referenced by dataURL. Field titles are humanized, primitive Parquet types are mapped to SmartClient field types, and group/complex types default to "text".
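To illustrate what title humanization and type mapping might look like, here is a rough sketch. The exact rules this class applies are not documented here, so both the humanizing rule and the type table below are assumptions; `SchemaDerivationSketch`, `humanize` and `fieldTypeFor` are hypothetical names.

```java
import java.util.Locale;

/** Hypothetical sketch of schema derivation rules; the real mapping may differ. */
final class SchemaDerivationSketch {

    /** Turns "first_name" or "firstName" into "First Name". */
    static String humanize(String columnName) {
        String spaced = columnName
            .replaceAll("([a-z0-9])([A-Z])", "$1 $2")  // split camelCase boundaries
            .replaceAll("[_\\-\\s]+", " ")             // separators become spaces
            .trim();
        StringBuilder out = new StringBuilder();
        for (String word : spaced.split(" ")) {
            if (word.isEmpty()) continue;
            if (out.length() > 0) out.append(' ');
            out.append(Character.toUpperCase(word.charAt(0))).append(word.substring(1));
        }
        return out.toString();
    }

    /** Assumed mapping from Parquet primitive type names to SmartClient field types. */
    static String fieldTypeFor(String parquetPrimitive) {
        if (parquetPrimitive == null) return "text";
        switch (parquetPrimitive.toUpperCase(Locale.ROOT)) {
            case "BOOLEAN": return "boolean";
            case "INT32":
            case "INT64":   return "integer";
            case "FLOAT":
            case "DOUBLE":  return "float";
            default:        return "text"; // binary, group/complex, and unknown types
        }
    }

    public static void main(String[] args) {
        System.out.println(humanize("first_name") + " : " + fieldTypeFor("INT64"));
    }
}
```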

SQL artifact generation APIs

These APIs are intended for quickly generating SmartClient SQL DataSource artifacts that can be used to import or test Parquet data with SQLDataSource tooling.

  • getSQLDSXML() / getSQLDSXML(String, String, String): Generates a SQL .ds.xml (serverType="sql") as a String (no filesystem writes). This is useful when the caller wants to create a disposable SQL DS via DataSource.fromXML(xml).
  • createSQLDS() / createSQLDS(String, String, String, String, String, Integer): Writes a SQL <dsId>.ds.xml under webRoot and also writes a data file for test/import.
    Filesystem access required: these methods resolve webRoot from Config.getGlobal() and write output files under that directory.
  • copyData(String, CopyDataFormat, Integer, Integer, boolean) / copyData(String, String, String, CopyDataFormat, Integer, Integer, boolean): Copies Parquet rows to SmartClient test-data XML (.data.xml) or JSON. Supports streaming output and optional batching (split output into multiple files after N rows).
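The batching behavior described for copyData (start a new output file after N rows) can be sketched independently of Parquet. The code below is an illustration only: `BatchedCopySketch` and `writeBatches` are hypothetical names, rows are assumed to be already serialized as strings, and the `<baseName>-<n>.txt` naming is an assumption rather than the naming this class uses.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical sketch: streams rows to files, rotating after batchSize rows. */
final class BatchedCopySketch {

    /** A batchSize of 0 or less means "no batching": all rows go to one file. */
    static List<Path> writeBatches(Iterator<String> rows, Path outDir,
                                   String baseName, int batchSize) {
        List<Path> written = new ArrayList<>();
        BufferedWriter writer = null;
        int rowsInFile = 0;
        try {
            try {
                while (rows.hasNext()) {
                    if (writer == null) {
                        // Open the next output file lazily, only when a row exists.
                        Path file = outDir.resolve(baseName + "-" + (written.size() + 1) + ".txt");
                        writer = Files.newBufferedWriter(file, StandardCharsets.UTF_8);
                        written.add(file);
                        rowsInFile = 0;
                    }
                    writer.write(rows.next()); // one row at a time; nothing is buffered in a list
                    writer.newLine();
                    rowsInFile++;
                    if (batchSize > 0 && rowsInFile >= batchSize) {
                        writer.close();
                        writer = null;
                    }
                }
            } finally {
                if (writer != null) writer.close();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return written;
    }
}
```

With five rows and a batch size of 2, this sketch produces three files, the last holding a single row. Because rows are pulled from an Iterator and written immediately, memory use stays flat regardless of the input size.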
Method Details

    • createSQLDS

      public com.isomorphic.datasource.ParquetDataSource.CreateSQLDSResult createSQLDS() throws Exception
      Convenience wrapper that reads dataURL from this DataSource's configuration (typically this.dsConfig.get("dataURL")) and then delegates to createSQLDS(String, String, String, String, String, Integer).

      Filesystem access required: this method writes generated files under webRoot.

      Returns:
      A CreateSQLDSResult containing derived ids and output file paths (for example: dsId, tableName, recordTag, dsXmlPath, dataXmlPath, rowsWritten).
      Throws:
      Exception - on any read/write/parse error
    • createSQLDS

      public com.isomorphic.datasource.ParquetDataSource.CreateSQLDSResult createSQLDS(String dataURL, String explicitDsId, String explicitTableName, String dsOutputRelDir, String dataOutputRelDir, Integer maxRows) throws Exception
      Generates SmartClient SQL import artifacts from a Parquet file:
      • Writes <dsId>.ds.xml (serverType="sql")
      • Writes <recordTag>.data.xml containing rows read from Parquet

      Filesystem access required: resolves webRoot from global Config and writes output files under that directory. If webRoot is not set, this method throws an IllegalStateException.

      Parameters:
      dataURL - Parquet file location (required)
      explicitDsId - optional DS id (if null, derived from URL)
      explicitTableName - optional tableName (if null, defaults to dsId)
      dsOutputRelDir - optional output dir for .ds.xml (relative to webRoot)
      dataOutputRelDir - optional output dir for .data.xml (relative to webRoot)
      maxRows - optional row limit (null or a negative value means all rows)
      Returns:
      A CreateSQLDSResult containing derived ids, output file paths (dsXmlPath, dataXmlPath), and rowsWritten.
      Throws:
      Exception - on any read/write/parse error
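The parameter rules above (row limit semantics, output directories relative to webRoot, failure when webRoot is unset) can be sketched as follows. This is an illustration under stated assumptions, not the method's implementation: `CreateSqlDsArgsSketch` and its methods are hypothetical, and the check that rejects directories escaping webRoot is an added assumption.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical sketch of argument handling for createSQLDS-style methods. */
final class CreateSqlDsArgsSketch {

    /** null or negative means "all rows", represented here as Long.MAX_VALUE. */
    static long effectiveRowLimit(Integer maxRows) {
        return (maxRows == null || maxRows < 0) ? Long.MAX_VALUE : maxRows.longValue();
    }

    /** Resolves an output file under webRoot, optionally inside a relative directory. */
    static Path resolveOutput(String webRoot, String relDir, String fileName) {
        if (webRoot == null || webRoot.trim().isEmpty()) {
            throw new IllegalStateException("webRoot is not set");
        }
        Path root = Paths.get(webRoot).toAbsolutePath().normalize();
        Path dir = (relDir == null || relDir.isEmpty())
            ? root
            : root.resolve(relDir).normalize();
        // Assumption: output must stay under webRoot ("../" segments are rejected).
        if (!dir.startsWith(root)) {
            throw new IllegalArgumentException("Output directory escapes webRoot: " + relDir);
        }
        return dir.resolve(fileName);
    }

    public static void main(String[] args) {
        System.out.println(effectiveRowLimit(-1) == Long.MAX_VALUE);
        System.out.println(resolveOutput("webroot", "ds", "orders.ds.xml"));
    }
}
```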
    • getSQLDSXML

      public String getSQLDSXML() throws Exception
      Generates SQL DataSource XML (serverType="sql") for the Parquet file referenced by this DataSource instance (via dataURL in dsConfig).

      This method performs no filesystem writes. It reads the Parquet schema and returns the generated DataSource XML as a String, which the caller can pass to DataSource.fromXML(xml) to create a disposable SQL DataSource.

      Conventions:

      • dsId defaults to this.dsName when present, otherwise derived from the Parquet filename
      • tableName defaults to dsId
      • testFileName is omitted by default (no test harness). If you want it, call the overload getSQLDSXML(String, String, String) and pass a value.
      Returns:
      DataSource XML as a String (serverType="sql")
      Throws:
      Exception - if dataURL is missing/invalid or Parquet schema cannot be read
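The id conventions listed above can be sketched as a pair of fallback rules. The precise way this class derives an id from the Parquet filename is not specified here, so stripping the extension and replacing non-identifier characters with underscores is an assumption; `SqlDsIdConventions`, `deriveDsId` and `tableNameFor` are hypothetical names.

```java
/** Hypothetical sketch of the dsId / tableName defaulting conventions. */
final class SqlDsIdConventions {

    /** Uses dsName when present, otherwise derives an id from the file name in dataURL. */
    static String deriveDsId(String dsName, String dataURL) {
        if (dsName != null && !dsName.isEmpty()) return dsName;
        String path = dataURL;
        int query = path.indexOf('?');
        if (query >= 0) path = path.substring(0, query);      // drop any query string
        String file = path.substring(path.lastIndexOf('/') + 1);
        int dot = file.lastIndexOf('.');
        if (dot > 0) file = file.substring(0, dot);           // drop the extension
        // Assumption: characters outside [A-Za-z0-9_] are replaced with underscores.
        String id = file.replaceAll("[^A-Za-z0-9_]", "_");
        if (id.isEmpty() || Character.isDigit(id.charAt(0))) id = "_" + id;
        return id;
    }

    /** tableName defaults to dsId unless an explicit value is given. */
    static String tableNameFor(String explicitTableName, String dsId) {
        return (explicitTableName == null || explicitTableName.isEmpty()) ? dsId : explicitTableName;
    }

    public static void main(String[] args) {
        String dsId = deriveDsId(null, "examples/shared/data/myfile.parquet");
        System.out.println(dsId + " / " + tableNameFor(null, dsId));
    }
}
```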
    • getSQLDSXML

      public String getSQLDSXML(String dataURL, String explicitDsId, String explicitTableName) throws Exception
      Generates SQL DataSource XML (serverType="sql") for the given Parquet location.

      This method performs no filesystem writes. It reads the Parquet schema and returns the generated DataSource XML as a String.

      Parameters:
      dataURL - Parquet file location (required)
      explicitDsId - optional DS id override (if null, derived from URL)
      explicitTableName - optional tableName override (if null, defaults to dsId)
      Returns:
      DataSource XML as a String (serverType="sql")
      Throws:
      Exception - on any read/parse error