TIL: Parse Platform: '.include()' on queries results in N+1 queries
2022-06-23 00:00:00 +0000 UTCA project I support at work uses Parse Platform and MongoDB in its backend. Parse provides a NodeJS API for DB queries called ParseQuery
that includes a method ParseQuery.include(). This method “includes nested Parse.Objects for the provided key.”
That might sound a little weird given the appeal of Mongo – instead of foreign key relations & joins, Mongo supports nested documents. If you have an one-to-one relationship between an application user and some domain object, you can simply include that domain object as a nested document, just like a JSON subobject. Similarly, if you have a one-to-many relationship from one domain object to another, you can just include an array of subdocuments. Then instead of explicitly joining tables like you might in a normal-form relational database, you can instead explicitly exclude fields from returned documents using projections.
However the subdocument method for domain modeling might not always work out. Parse Platform provides a structured way to impose a relational system onto Mongo. ParseObjects
are the application-side abstraction for this ORM-like system. Each ParseObject
has its own collection in Mongo. A ParseObject
can include pointers (the term used by Parse for foreign keys) to objects in other collections. The .include(...)
call instructs Parse to “dereference” those pointers. Because Mongo is not a relational database, there are two options for this procedure:
- Perform a second Mongo
find()
query against the foreign collection using the pointers from the local document. - Rewrite our query as an aggregation pipeline using
$lookup
to perform the join.
Parse chooses option (1). This means each unique collection include()
’d will result in another trip to the database. While (2) would more closely approximate the behavior of a relational database that can pull data from a foreign table in a single query through join, I am not sure it is possible or better for the software to transparently perform the rewrite (though it is certainly possible for a developer to rewrite their code to use a single aggregation). Here is how the single ParseQuery
is transformed into multiple Mongo queries:
First, ParseQuery.include(...)
in the JS SDK:
include(...keys: Array<string | Array<string>>): ParseQuery {
keys.forEach(key => {
if (Array.isArray(key)) {
this._include = this._include.concat(key);
} else {
this._include.push(key);
}
});
return this;
}
When we call include()
in our query call chain, it just pushes the key into an internal array of included keys. So far so good. What does it do with this list? We can check ParseQuery.find(...)
to see. I won’t reproduce the whole thing here, but it does not mention includes. However we can see it passes on a JSON representation of itself to a QueryController
. The toJSON
method includes []this relevant bit](https://github.com/parse-community/Parse-SDK-JS/blob/2f2a6bdb37f30e767a56a97b26bdd622b99700ac/src/ParseQuery.js#L436-L438):
if (this._include.length) {
params.include = this._include.join(',');
}
Simple enough! We turn the list into a comma-separated string. So where does this get sent off to? We can see the default QueryController
at the bottom of ParseQuery.js:
const DefaultController = {
find(className: string, params: QueryJSON, options: RequestOptions): Promise<Array<ParseObject>> {
const RESTController = CoreManager.getRESTController();
return RESTController.request('GET', 'classes/' + className, params, options);
},
aggregate(className: string, params: any, options: RequestOptions): Promise<Array<mixed>> {
const RESTController = CoreManager.getRESTController();
return RESTController.request('GET', 'aggregate/' + className, params, options);
},
};
So eventually this will be processed on the server as a RestQuery. There are a few intermediate steps but we eventually end up in a call to a method named handleInclude which calls out to includePath where we can find this snippet:
const queryPromises = Object.keys(pointersHash).map(className => {
const objectIds = Array.from(pointersHash[className]);
let where;
if (objectIds.length === 1) {
where = { objectId: objectIds[0] };
} else {
where = { objectId: { $in: objectIds } };
}
var query = new RestQuery(config, auth, className, where, includeRestOptions);
return query.execute({ op: 'get' }).then(results => {
results.className = className;
return Promise.resolve(results);
});
});
In short, the code to this point has generated a hashmap called pointersHash
that indexes pointers (again, Parse’s name for foreign keys) by the class name (e.g. ParseObject
type and therefore DB collection name). Parse finally maps over the keys to this map and generates one RestQuery
per object we have included. These RestQuery
objects eventually each produce a Mongo find
query, taking their own trip to the database to produce a list of documents our original RestQuery
adds as sub-objects in its eventual response.
Bottom line? Be careful running Parse.Query
with .include()
s. Depending on the use case, you might end up with a lot more trips to the database than you expected!