Data Storage

Marcelle provides a unified interface for storing data either on the client side (in memory or using web storage) or on a remote server. Some components rely on a data store (for instance, the Dataset component needs one to store instances), but data collections can also be created on the fly to store custom information. This is particularly useful for storing part of the application's state (for instance, a model's parameters), a history of changes to the application, or custom session logs recording some of the user's interactions.

We use the Feathers framework to facilitate the creation of collections of heterogeneous data. When data is stored on the client side, no configuration is necessary. For remote data persistence, a server application can be generated in minutes using Feathers' command-line interface, with a large range of database systems available. Being able to choose the data store's location is convenient for rapid prototyping, where data can be stored on the client side or on a local server during development.

DataStore

The following factory function creates and returns a Marcelle data store:

dataStore(location: string): DataStore

The location argument can either be:

  • 'memory' (default): in this case the data is stored in memory, and does not persist after page refresh
  • 'localStorage': in this case the data is stored using the browser's web storage. It will persist after page refresh, but there is a limitation on the quantity of data that can be stored.
  • a URL indicating the location of the server. The server needs to be programmed with Feathers, as described below.
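The three kinds of location listed above can be told apart from the string alone. The following is a minimal sketch of that distinction; `resolveBackend` and `BackendKind` are hypothetical names that are not part of Marcelle, and the restriction to http(s) URLs is an assumption made for the example.

```typescript
// Hypothetical helper (not part of Marcelle): classifies a `location` string
// according to the three cases described above.
type BackendKind = 'memory' | 'localStorage' | 'remote';

function resolveBackend(location: string = 'memory'): BackendKind {
  if (location === 'memory' || location === 'localStorage') {
    return location;
  }
  // Assumption: any other value must be an http(s) URL pointing to a server.
  if (!/^https?:\/\/\S+$/.test(location)) {
    throw new Error(`Unsupported data store location: ${location}`);
  }
  return 'remote';
}

console.log(resolveBackend()); // memory
console.log(resolveBackend('localStorage')); // localStorage
console.log(resolveBackend('http://localhost:3030')); // remote
```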

.connect()

async connect(): Promise<User>

Connect to the data store backend. If using a remote backend, the server must be running, otherwise an exception will be thrown. If the backend is configured with user authentication, this method will require the user to log in.

This method is automatically called by dependent components such as datasets and models.

.login()

async login(email: string, password: string): Promise<User>;

.logout()

async logout(): Promise<void>

.service()

service(name: string): Service<unknown>

Get the Feathers service instance with the given name. If the service does not exist yet, it will be automatically created. Note that the name of the service determines the name of the collection in the data store. Choose names carefully to avoid conflicts between collections.

The method returns a Feathers Service instance, whose API is documented on Feathers' website. The interface exposes find, get, create, update, patch and remove methods for manipulating the data.

.uploadAsset()

async uploadAsset(blob: Blob, filename = ''): Promise<string>

Upload an asset to a remote backend from a Blob, with an optional filename. The method returns the path of the file on the backend server.

.signup()

async signup(email: string, password: string): Promise<User>

Service

Data services are instances of Feathers Services. For details, refer to Feathers' documentation. From Feathers:

Service methods are pre-defined CRUD methods that your service object can implement (or that have already been implemented by one of the database adapters). Below is an example of a Feathers service using async/await as a JavaScript class:

class MyService {
  async find(params) {
    return [];
  }
  async get(id, params) {}
  async create(data, params) {}
  async update(id, data, params) {}
  async patch(id, data, params) {}
  async remove(id, params) {}
  setup(app, path) {}
}

app.use('/my-service', new MyService());

Service methods must use async/await or return a Promise and have the following parameters:

  • id — The identifier for the resource. A resource is the data identified by a unique id.
  • data — The resource data.
  • params - Additional parameters for the method call (see the Feathers docs)

.find()

Service<T>.find(params: Params): Promise<Paginated<T>>

Retrieves a list of all resources from the service. params.query can be used to filter and limit the returned data.
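To make the role of params.query and the shape of a paginated result concrete, here is a minimal in-memory sketch. `findInMemory` is a hypothetical stand-in, not Marcelle or Feathers code; it only handles equality filters plus Feathers' `$limit` and `$skip` operators, and its default limit of 10 is an arbitrary choice for the example.

```typescript
// Shape of a paginated Feathers result.
interface Paginated<T> {
  total: number; // number of records matching the query
  limit: number; // page size that was applied
  skip: number; // number of matching records skipped
  data: T[]; // the current page
}

interface Query {
  $limit?: number;
  $skip?: number;
  [field: string]: unknown;
}

// Hypothetical in-memory stand-in for `service.find({ query })`.
function findInMemory<T extends Record<string, unknown>>(
  items: T[],
  query: Query = {},
): Paginated<T> {
  const { $limit = 10, $skip = 0, ...filters } = query;
  // Keep the items whose fields are strictly equal to every filter value.
  const matches = items.filter((item) =>
    Object.entries(filters).every(([key, value]) => item[key] === value),
  );
  return {
    total: matches.length,
    limit: $limit,
    skip: $skip,
    data: matches.slice($skip, $skip + $limit),
  };
}

const things = [
  { _id: '1', team: 'A' },
  { _id: '2', team: 'B' },
  { _id: '3', team: 'A' },
  { _id: '4', team: 'A' },
];
const page = findInMemory(things, { team: 'A', $skip: 1, $limit: 2 });
console.log(page.total, page.data.map((t) => t._id)); // 3 [ '3', '4' ]
```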

.get(id, params)

Service<T>.get(id: string, params: Params): Promise<T>

Retrieves a single resource with the given id from the service.

.create()

Service<T>.create(data: T, params: Params): Promise<T>

Creates a new resource with data. The method should return with the newly created data. data may also be an array.

.update()

Service<T>.update(id: string, data: T, params: Params): Promise<T>

Replaces the resource identified by id with data. The method should return with the complete, updated resource data. id can also be null when updating multiple records, with params.query containing the query criteria.

.patch()

Service<T>.patch(id: string, data: Partial<T>, params: Params): Promise<T>

Merges the existing data of the resource identified by id with the new data. id can also be null indicating that multiple resources should be patched with params.query containing the query criteria.
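The difference between update (replace) and patch (merge) is easy to miss. The sketch below illustrates it on plain objects; `Thing`, `updateRecord` and `patchRecord` are hypothetical names used only for this example and do not reflect how a Feathers database adapter is implemented.

```typescript
interface Thing {
  _id: string;
  label: string;
  x?: number[];
}

// update: the stored record is replaced by `data`; only the id is kept.
function updateRecord(existing: Thing, data: Omit<Thing, '_id'>): Thing {
  return { ...data, _id: existing._id };
}

// patch: `changes` are merged into the stored record.
function patchRecord(existing: Thing, changes: Partial<Thing>): Thing {
  return { ...existing, ...changes };
}

const stored: Thing = { _id: '1', label: 'A', x: [1, 2] };

const replaced = updateRecord(stored, { label: 'B' });
console.log('x' in replaced); // false: `x` was not part of the new data

const patched = patchRecord(stored, { label: 'B' });
console.log(patched.x); // [ 1, 2 ]: untouched fields are preserved
```

In practice, patch is usually the safer choice when only a few fields change, since update discards any field missing from the new data.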

.remove()

Service<T>.remove(id: string, params: Params): Promise<T>

Removes the resource with id. The method should return with the removed data. id can also be null, which indicates the deletion of multiple resources, with params.query containing the query criteria.

ServiceIterable

A ServiceIterable is a lazy iterable over the data of a data store service. It allows the service's data to be processed lazily, on demand. A ServiceIterable can be created using the following factory function:

function iterableFromService<T>(service: Service<T>): ServiceIterable<T>;

This function returns an iterable with the following API:

export class ServiceIterable<T> extends LazyIterable<T> {
  skip(n: number): ServiceIterable<T>;
  take(n: number): ServiceIterable<T>;
  select(fields: string[]): ServiceIterable<T>;
  query(q: Params['query']): ServiceIterable<T>;
}

The LazyIterable class is documented in utilities. There are a few additional methods specific to data store services that are meant to improve performance. These methods relate to querying the data store (which is backed by Feathers.js). They can be chained like other LazyIterable methods, but they must be called before any other LazyIterable method.

Example:

// We want to get service items that have the field 'team' equal to 'A'. We skip 3 items
// and keep the next 5, and select a few fields to be returned.
const myIterable = iterableFromService(store.service('things'))
  .query({ team: 'A' })
  .skip(3)
  .take(5)
  .select(['_id', 'x', 'y', 'label']);

// The iterable is created, but the processing won't be executed until we iterate on it:
for await (const x of myIterable) {
  console.log(x);
}

// Or if we convert it to an array:
const myThings = await myIterable.toArray();
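One way to picture why these methods can improve performance is that each chained call contributes to a single Feathers query, so that filtering happens in the data store rather than in the application. The sketch below assumes that reading; `QuerySketch` is a hypothetical class written for illustration and is not taken from Marcelle's source. The operators `$skip`, `$limit` and `$select` are part of Feathers' query syntax.

```typescript
type FeathersQuery = Record<string, unknown>;

// Hypothetical builder: accumulates chained calls into one Feathers query.
class QuerySketch {
  private readonly q: FeathersQuery;

  constructor(q: FeathersQuery = {}) {
    this.q = q;
  }

  query(filters: FeathersQuery): QuerySketch {
    return new QuerySketch({ ...this.q, ...filters });
  }

  skip(n: number): QuerySketch {
    return new QuerySketch({ ...this.q, $skip: n });
  }

  take(n: number): QuerySketch {
    return new QuerySketch({ ...this.q, $limit: n });
  }

  select(fields: string[]): QuerySketch {
    return new QuerySketch({ ...this.q, $select: fields });
  }

  // The query object that would be sent as `params.query`.
  toQuery(): FeathersQuery {
    return { ...this.q };
  }
}

const sketch = new QuerySketch()
  .query({ team: 'A' })
  .skip(3)
  .take(5)
  .select(['_id', 'x', 'y', 'label']);

console.log(sketch.toQuery());
// { team: 'A', '$skip': 3, '$limit': 5, '$select': [ '_id', 'x', 'y', 'label' ] }
```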

.skip()

skip(n: number): ServiceIterable<T>;

Skip n elements from the service

.take()

take(n: number): ServiceIterable<T>;

Take only n elements from the service

.select()

select(fields: string[]): ServiceIterable<T>;

Select the fields to be returned for each item in the service. This can be used to optimize bandwidth and speed, by limiting the query to only the necessary fields.

.query()

query(q: Params['query']): ServiceIterable<T>;

Query items from the service using Feathers' query syntax. See the Feathers docs.

Dataset

dataset<T extends Instance>(name: string, store?: DataStore): Dataset<T>

A Dataset component captures instances from a stream and stores them in a local or remote data store. Items of a dataset are called instances; they are JavaScript objects of arbitrary shape, although by convention the fields x, y and thumbnail are commonly used. When using TypeScript, it is possible to extend the specification of the Instance interface:

export interface Instance {
  id?: ObjectId; // Object identifier in the database
  x: any; // Typically, input data
  y: any; // Typically, output data (for supervised learning)
  thumbnail?: string; // Thumbnail used for display in components such as datasetBrowser
  [key: string]: any;
}

Example:

const store = dataStore('localStorage');
const trainingSet = dataset('TrainingSet', store);

$instances.subscribe(trainingSet.create);

Parameters

| Option | Type | Description | Required |
| --- | --- | --- | --- |
| name | string | The dataset name | |
| store | DataStore | The dataStore used to store the instances of the dataset. | |

Properties

| Name | Type | Description | Hold |
| --- | --- | --- | --- |
| $count | Stream<number> | Total number of instances in the dataset | |
| $changes | Stream<DatasetChange[]> | Stream of changes applied to the dataset. Changes can concern a number of modifications (creation, update, deletion, ...) at various levels (dataset, class, instance). The interface is described below | |

Where dataset changes have the following interface:

interface DatasetChange {
  level: 'instance' | 'dataset';
  type: 'created' | 'updated' | 'removed' | 'renamed';
  data?: unknown;
}
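As an illustration of how a batch of changes might be consumed, the sketch below tallies changes by level and type. `summarizeChanges` is a hypothetical helper; the DatasetChange interface is repeated from above so that the example is self-contained.

```typescript
interface DatasetChange {
  level: 'instance' | 'dataset';
  type: 'created' | 'updated' | 'removed' | 'renamed';
  data?: unknown;
}

// Hypothetical helper: counts changes per "level:type" pair.
function summarizeChanges(changes: DatasetChange[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const change of changes) {
    const key = `${change.level}:${change.type}`;
    counts[key] = (counts[key] ?? 0) + 1;
  }
  return counts;
}

const batch: DatasetChange[] = [
  { level: 'instance', type: 'created' },
  { level: 'instance', type: 'created' },
  { level: 'dataset', type: 'renamed' },
];
console.log(summarizeChanges(batch));
// { 'instance:created': 2, 'dataset:renamed': 1 }
```

Such a helper could be passed the arrays emitted by the $changes stream of a dataset, for example to build a session log.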

.clear()

async clear(): Promise<void>

Clear the dataset, removing all instances.

.create()

async create(instance: Instance<InputType, OutputType>, params?: FeathersParams): Promise<Instance<InputType, OutputType>>

Create an instance in the dataset

| Option | Type | Description | Required |
| --- | --- | --- | --- |
| instance | Instance<InputType, OutputType> | The instance data | |
| params | FeathersParams | Feathers Query parameters. See the Feathers docs. | |

.download()

async download(): Promise<void>

Download the dataset as a single JSON file.

.find()

async find(params?: FeathersParams): Promise<Paginated<Instance<InputType, OutputType>>>

Get instances from the dataset, optionally passing Feathers parameters. Results are paginated, using the same format as services.

| Option | Type | Description | Required |
| --- | --- | --- | --- |
| params | FeathersParams | Feathers Query parameters. See the Feathers docs. | |

.get()

async get(id: ObjectId, params?: FeathersParams): Promise<Instance<InputType, OutputType>>

Get an instance from the dataset by ID, optionally passing Feathers parameters.

| Option | Type | Description | Required |
| --- | --- | --- | --- |
| id | ObjectId | The instance's unique ID | |
| params | FeathersParams | Feathers Query parameters. See the Feathers docs. | |

.items()

items(): ServiceIterable<Instance<InputType, OutputType>>

Get a lazy service iterable to iterate over the dataset.

Example:

const instances = await dataset
  .items() // get iterable
  .query({ label: 'A' }) // query instances with label 'A'
  .select(['id', 'thumbnail']) // select the fields to return
  .toArray(); // convert to array

.patch()

patch(id: ObjectId, changes: Partial<Instance>, params?: FeathersParams): Promise<Instance>

Patch an instance in the dataset

| Option | Type | Description | Required |
| --- | --- | --- | --- |
| id | ObjectId | The instance's unique ID | |
| changes | Partial<Instance> | The instance data | |
| params | FeathersParams | Feathers Query parameters. See the Feathers docs. | |

.remove()

remove(id: ObjectId, params?: FeathersParams): Promise<Instance>

Remove an instance from the dataset

| Option | Type | Description | Required |
| --- | --- | --- | --- |
| id | ObjectId | The instance's unique ID | |
| params | FeathersParams | Feathers Query parameters. See the Feathers docs. | |

.sift()

sift(query: Query = {}): void

Filter the contents of the dataset using a Feathers Query. Sifting a dataset enforces that instances respect a given query. This affects all interactions with the dataset and dependent components. Note that it is possible to create several dataset instances with different sift filters that point to the same data store service, effectively creating different views on a given data collection.

| Option | Type | Description | Required |
| --- | --- | --- | --- |
| query | Query | Feathers Query parameters. See the Feathers docs. | |
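The idea of several sifted views over one collection can be sketched with plain arrays. `DatasetView` below is a hypothetical class for illustration only: it supports equality filters, whereas a real sift accepts the full Feathers query syntax, and it does not reflect Marcelle's implementation.

```typescript
interface Instance {
  id?: string;
  x: unknown;
  y: unknown;
  [key: string]: unknown;
}

// Hypothetical view over a shared collection, restricted by an equality filter.
class DatasetView {
  private filter: Record<string, unknown> = {};
  private readonly source: Instance[];

  constructor(source: Instance[]) {
    this.source = source;
  }

  // Restrict this view to the instances matching `query`.
  sift(query: Record<string, unknown> = {}): void {
    this.filter = query;
  }

  items(): Instance[] {
    return this.source.filter((instance) =>
      Object.entries(this.filter).every(([key, value]) => instance[key] === value),
    );
  }
}

// Both views point to the same underlying collection.
const collection: Instance[] = [
  { id: '1', x: [0, 1], y: 'cat' },
  { id: '2', x: [1, 0], y: 'dog' },
  { id: '3', x: [1, 1], y: 'cat' },
];
const cats = new DatasetView(collection);
const dogs = new DatasetView(collection);
cats.sift({ y: 'cat' });
dogs.sift({ y: 'dog' });

console.log(cats.items().length, dogs.items().length); // 2 1
```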

.update()

update(id: ObjectId, instance: Instance<InputType, OutputType>, params?: FeathersParams): Promise<Instance>

Update an instance in the dataset

| Option | Type | Description | Required |
| --- | --- | --- | --- |
| id | ObjectId | The instance's unique ID | |
| instance | Instance<InputType, OutputType> | The instance data | |
| params | FeathersParams | Feathers Query parameters. See the Feathers docs. | |

.upload()

async upload(files: File[]): Promise<void>

Upload a dataset from files.

| Option | Type | Description | Required |
| --- | --- | --- | --- |
| files | File[] | Array of files of type File | |

Server-Side Storage

Marcelle provides a dedicated package for server-side data storage: @marcellejs/backend. It can easily be integrated into existing Marcelle applications using the CLI, and only requires minimal configuration for local use.

Marcelle backends are FeathersJS applications that provide persistent data storage with either NeDB or MongoDB.

Disclaimer

The backend package is under active development and is not yet stable. It is not production-ready.

Adding a backend to an existing application

A backend can be added to a Marcelle application using the CLI. Run one of the following commands, depending on your package manager:

npx marcelle
yarn marcelle
pnpx marcelle

Select 'Manage the backend', then 'Configure a backend'. This will install @marcellejs/backend as a dependency of your project and create configuration files in backend/config.

Two database systems are currently available for storing data: NeDB and MongoDB.

To run the backend locally:

npm run backend

The backend API will be available at http://localhost:3030. From a Marcelle application, interacting with this backend is done through data stores, by instantiating them with the server URL as the location parameter:

const store = dataStore('http://localhost:3030');

Configuration

Backends can be configured through two JSON files located in the backend/config directory, for development or production. Please refer to Feathers' documentation for general information about Feathers configuration. In this section, we detail Marcelle-specific configuration only.

| name | type | default | description |
| --- | --- | --- | --- |
| host | string | localhost | Host name for development. |
| port | number | 3030 | Port |
| database | nedb \| mongodb | nedb | The type of database to use. This is pre-configured when generated with the CLI. |
| nedb | path | "../data" | The local path to the folder where NeDB data should be stored |
| uploads | path | "../uploads" | The local path to the folder where file uploads should be stored |
| mongodb | url | "mongodb://localhost:27017/marcelle_backend" | The URL of the MongoDB database used for development |
| gridfs | boolean | true | Whether or not to upload files and assets to GridFS instead of the file system |
| whitelist.services | string[] \| "*" | "*" | The list of services that are allowed on the backend. "*" acts as a wildcard, allowing any service to be created from Marcelle applications |
| whitelist.assets | string[] \| "*" | ["jpg", "jpeg", "png", "wav"] | The types of assets (file extensions) allowed for file upload on the server |
| paginate.default | number | 100 | The default number of items per page for all requests |
| paginate.max | number | 1000 | The maximum number of items per page for all requests |
| authentication.enabled | boolean | false | Whether or not to enable authentication |
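To illustrate how the pagination and asset whitelist options can be read, here is a small sketch using the default values from the table. `effectiveLimit` and `isAssetAllowed` are hypothetical helpers, not backend code. The clamping follows the usual Feathers pagination behaviour (default page size when no limit is requested, maximum as an upper bound); the case-insensitive extension check is an assumption.

```typescript
interface BackendConfig {
  paginate: { default: number; max: number };
  whitelist: { assets: string[] | '*' };
}

// Default values taken from the configuration table.
const defaults: BackendConfig = {
  paginate: { default: 100, max: 1000 },
  whitelist: { assets: ['jpg', 'jpeg', 'png', 'wav'] },
};

// Page size applied to a request, given the $limit requested by the client.
function effectiveLimit(config: BackendConfig, requested?: number): number {
  if (requested === undefined) {
    return config.paginate.default;
  }
  return Math.min(requested, config.paginate.max);
}

// Whether a file may be uploaded, based on its extension.
function isAssetAllowed(config: BackendConfig, filename: string): boolean {
  const allowed = config.whitelist.assets;
  if (allowed === '*') {
    return true;
  }
  const dot = filename.lastIndexOf('.');
  if (dot < 0) {
    return false; // no extension at all
  }
  // Assumption: extensions are compared case-insensitively.
  return allowed.includes(filename.slice(dot + 1).toLowerCase());
}

console.log(effectiveLimit(defaults)); // 100
console.log(effectiveLimit(defaults, 5000)); // 1000
console.log(isAssetAllowed(defaults, 'photo.png')); // true
console.log(isAssetAllowed(defaults, 'notes.txt')); // false
```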

Permissions

It is possible to specify the permissions for a particular project in the configuration file. The permissions field of the config file accepts a record associating each role name ("editor" by default) with an array of CASL rules.

The following example specifies a default set of permissions:

  "permissions": {
    "superadmin": [
      {"action": "manage", "subject": "all"}
    ],
    "admin": [
      { "action": "manage", "subject": "all", "conditions": { "userId": "${user._id}" } },
      { "action": "manage", "subject": "all", "conditions": { "public": "${true}" } },
      { "action": "manage", "subject": "users" }
    ],
    "editor": [
      { "action": "manage", "subject": "all", "conditions": { "userId": "${user._id}" } },
      { "action": "manage", "subject": "all", "conditions": { "public": "${true}" } },
      { "action": "read", "subject": "users" },
      { "action": "update", "subject": "users", "conditions": { "_id": "${user._id}" } },
      { "action": "delete", "subject": "users", "conditions": { "_id": "${user._id}" }, "inverted": "true" }
    ]
  }
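The "${user._id}" placeholders in the conditions above suggest that rules are resolved against the current user before being evaluated. The sketch below shows one way such a substitution could work. It is an assumption made for illustration, not the backend's actual implementation: `resolveConditions` is a hypothetical helper that only handles placeholders of the form ${user.<field>} and leaves every other value (including "${true}") untouched.

```typescript
type Conditions = Record<string, unknown>;

interface User {
  _id: string;
  [key: string]: unknown;
}

// Hypothetical helper: replaces "${user.<field>}" placeholders with the
// corresponding field of the current user.
function resolveConditions(conditions: Conditions, user: User): Conditions {
  const resolved: Conditions = {};
  for (const [key, value] of Object.entries(conditions)) {
    const match =
      typeof value === 'string' ? /^\$\{user\.(\w+)\}$/.exec(value) : null;
    const field = match?.[1];
    resolved[key] = field !== undefined ? user[field] : value;
  }
  return resolved;
}

const currentUser: User = { _id: 'abc123' };

console.log(resolveConditions({ userId: '${user._id}' }, currentUser));
// { userId: 'abc123' }
```

With the rule resolved this way, a condition such as { "userId": "abc123" } restricts the "manage" action to resources owned by the current user.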