**[[Dataview Caching]]** is a method to (A) incorporate [[Dataview]] queries into [[Obsidian Publish]] and/or (B) ensure the results of Dataview queries are always 'resolved-as-markdown' within your vault. ^intro
1. [[#What problems does it solve]]
2. [[#Simplified version]]
3. [[#Optional complications]]
4. [[#Notes & acknowledgements]]
## What problems does it solve
>[!abstract] Dataview queries are not publishable
> As a community plugin, Dataview's live functionality is not deployed to sites using [[Obsidian Publish]]. This method is intended to help people simply, efficiently, and scalably publish any number of Dataview queries to an Obsidian Publish site (or any other markdown publisher).
>[!abstract] Dataview results are tied to Obsidian with Dataview
> Insights gained from Dataview depend on the continued use and availability of Obsidian and Dataview. Regularly materialising Dataview query results into markdown helps people maintain readable representations of these insights that are independent of specific tools (other than a text reader).
## Simplified version
You technically only need the [[Dataview]] plugin for this method, though the script below uses [[Templater]] for convenience and accessibility. Steps 1 & 3 only need to be done once per page. The script in step 2 does not need to be maintained once defined, just run as required.
>[!example] The minimalised concept is
> 1. Place each page's dataview query into a [[Property|property]]
> 2. Run a script that iterates through pages with this property, where for each:
>     1. [Re-]generate a cache page based on the current page's name (e.g. `page` > `page_cache`)
>     2. Use [[Dataview]]'s `dv.queryMarkdown()` to output as markdown onto this page
> 3. [[Transclude]] the cache page onto the main page (e.g. `![[page_cache]]`) in lieu of the live Dataview call
>[!example] The 60-second plug-n-play steps are
> 1. Install the [[Dataview]] and [[Templater]] community plugins (if you haven't already)
> 2. Copy the script below into a new Templater script (call it whatever)
> 3. Find a page with a Dataview query (e.g. `page`) and put the query into a page property called `query`
> 4. Run the script
> 5. Replace the original Dataview code block with an embed of this cache page (e.g. `![[page_cache]]`)
```js
<%*
// Property that holds each page's Dataview query, and the suffix for cache page names
const QUERY_PROPERTY = "query";
const CACHE_SUFFIX = "_cache";
let start = new Date();
let countNew = 0;
let countUpdated = 0;
let countFailed = 0;
console.log(`--- Dataview Caching Commencing @ ${start} ---`);
const dv = app.plugins.plugins["dataview"].api;
// Every page in the vault with a non-empty query property
const scripts = dv.pages().where(p => p[QUERY_PROPERTY]);
for (let script of scripts) {
    try {
        // Resolve the query to markdown; a failed query is reported on the result, not thrown
        let queryResult = await dv.queryMarkdown(script[QUERY_PROPERTY]);
        // Raise it here so a broken query counts as a failure and the old cache is left alone
        if (!queryResult.successful) throw new Error(queryResult.error);
        let cacheResult = queryResult.value || "";
        let cacheFilename = `${script.file.name}${CACHE_SUFFIX}`;
        // Create an empty cache page in the source page's folder if none exists yet
        if (!tp.file.find_tfile(cacheFilename)) {
            await tp.file.create_new("", cacheFilename, false, app.vault.getAbstractFileByPath(script.file.folder));
            console.log(`"${script.file.path}" >> created "${cacheFilename}"`);
            countNew++;
        }
        // Overwrite the cache page with the fresh result
        let tfile = tp.file.find_tfile(cacheFilename);
        await app.vault.modify(tfile, cacheResult);
        console.log(`"${script.file.path}" >> updated "${tfile.path}"`);
        countUpdated++;
    } catch (error) {
        console.log(`"${script.file.path}" >> UPDATE FAILED: ${error.message}`);
        countFailed++;
    }
}
let finish = new Date();
new Notice(`Dataview Caching\n\nPass: ${countUpdated} (${countNew} new)\nFail: ${countFailed}\nTime: ${finish - start}ms\n\nRefer to developer console for more information (Ctrl + Shift + I)`);
console.log(`--- Dataview Caching Completed @ ${finish} (${finish - start}ms) ---`);
%>
```
## Optional complications
- [[#Decoupling queries from pages]]
- [[#Using properties for granular script control]]
- [[#Dual-rendering (live and cache)]]
- [[#Caching only published pages (local vs publish)]]
- [[#Including an audit trail]]
- [[#Backlink interception]]
- [[#On-page replacement]]
This site uses my initial (and more convoluted) version from 2023 that accommodates some of my more wrinkly use cases (also as an experiment to learn some JS). The minimalised version above has been made retroactively 'with the benefit of hindsight' to hopefully [[Antoine de Saint-Exupery on Perfection|make this kind of functionality accessible to more people]].
Below is a mediocre attempt at modularising some of the 'complications' in my current solution. Feel free to pick and choose and riff off any of these as they apply to your own use case.
### Decoupling queries from pages
The simplified solution assumes the simple case of *"one query, per page, that query, this page"*. Instead of having the same page both ***embedding*** the query in its content and ***holding*** the query in its properties, you can just create a separate 'script page' to hold the property and embed that cache page, so now:
- You can embed many query results on one page
- You can embed one query result on many pages
- These 'script pages' are now an effective database of all your Dataview queries
- You can repurpose the body of these script pages to document the script itself
- You can reference and re-use the scripts in other places*
All the above are actually achievable without decoupling into script pages: This is just a practice to provide a cleaner split if you are (like me) obsessive over naming conventions to the point of mental discomfort.
*\* Now that the script is a property, you can re-use it elsewhere using `dv.execute()`, e.g. `dv.execute(dv.page("my_reusable_script").script_query)`. Note, however, that this is back to a live call and will not work directly with the caching discussed here.*
### Using properties for granular script control
You can add more granular control over each cache page by using additional properties on each script page and handling them in the generation script. Some examples:
- **`script_type`**: to indicate whether the provided script is `dataview` or `dataviewjs` or anything else you might want to handle in caching differently
- **`cache_folder`**: to choose what folder the cache page should be generated into
- **`cache_file`**: to choose what filename the cache page should have
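A minimal sketch of how a generation script might honour these properties. The property names come from the list above; the helper name `resolveCacheTarget`, the fallback to the script page's own folder, and the `_cache` default suffix are assumptions for illustration. It takes a plain object shaped like a Dataview page, so it can be tried outside Obsidian.

```js
// Hypothetical helper: decide where a script page's cache should be written.
// `page` is a plain object shaped like a Dataview page (properties plus `file`).
function resolveCacheTarget(page, suffix = "_cache") {
    // Fall back to the script page's own folder and name when no override is set
    const folder = page.cache_folder ?? page.file.folder;
    const name = page.cache_file ?? `${page.file.name}${suffix}`;
    // Treat anything other than an explicit "dataviewjs" as a plain Dataview query
    const type = page.script_type === "dataviewjs" ? "dataviewjs" : "dataview";
    // Root-level files have an empty folder, so avoid a leading slash
    const path = folder ? `${folder}/${name}.md` : `${name}.md`;
    return { folder, name, path, type };
}
```

Inside the loop of the simplified script, the returned `name` and `folder` could stand in for `cacheFilename` and `script.file.folder`, and `type` could drive a branch for anything that is not a plain Dataview query.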
### Dual-rendering (live and cache)
One downside of caching is that it can be too slow and cumbersome to use when working on volatile datasets, e.g. when you are using a Dataview-powered MOC to guide your curation of the very knowledge articles the MOC is about.
One way to 'get the best of both worlds' is to embed both (A) the live Dataview query AND (B) the cache page, then use a CSS snippet to conditionally hide the cache version (when viewing in app) or the live version (when viewing as a published site).
The following is an example of the dual-embed on my [[State]] page:
```html
<span class="dataview">`$=dv.execute(dv.page("State (script)").script_local)`</span>![[State (tpdv output)#Content|no-h app-hidden]]
```
And a CSS snippet like the below will conditionally hide one of them based on the view container. You can craft finer selectors, but the below should work with the above 'out of the box':
```css
.published-container .dataview,
.obsidian-app span[alt*="app-hidden"]
{
display: none !important;
}
```
### Caching only published pages (local vs publish)
Note that Dataview queries will return all pages that meet your defined criteria ***even if those pages are not also published***. This means you may publish broken links and ***unintentionally expose the existence of pages that you did not intend to publish***. What does this mean?
**If you are just using this method to back up your query results** - No worries
**If you are using this method to just publish** - Consider limiting your queries with a clause such as `WHERE publish=true` if you use that flag to manage publishing, `FROM "published"` if you only publish from a specific folder, or whatever other heuristic you use to manage the publishability state of your pages.
**If you are using this method to achieve both** - You may wish to have two query properties, one version to return all the results for your local backup, and another version for publishing. Your script will need to look at and generate both result sets, and you will need a 'dual-rendering' CSS wrap similar to the one above.
If you combine this complication with the dual rendering above, you will have:
- A live view of all results when you are viewing the page from your Obsidian
- A publishable set of results limited to only other published pages on your site
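A rough sketch of the 'two query properties' idea, assuming hypothetical property names `query_local` and `query_publish` and hypothetical cache suffixes (none of these come from the script above). It only works out which cache jobs a page needs; running each query and writing the file would follow the same pattern as the simplified script.

```js
// Hypothetical property names and cache suffixes; rename to suit your vault
const QUERY_VARIANTS = [
    { property: "query_local", suffix: "_cache_local" },
    { property: "query_publish", suffix: "_cache_publish" },
];

// Return one cache job per query variant that is actually defined on the page
function cacheJobsFor(page) {
    return QUERY_VARIANTS
        .filter(v => typeof page[v.property] === "string" && page[v.property].trim() !== "")
        .map(v => ({
            query: page[v.property],
            cacheFilename: `${page.file.name}${v.suffix}`,
        }));
}
```

A page that defines only one of the two properties simply yields one job, so pages that are never published do not need a publish variant.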
### Including an audit trail
Since these cache pages are automatically-generated content, it is generally a good idea to leave some kind of audit trail, e.g. at least the date it was generated.
A simple solution is to add a frontmatter block when generating the cache page, and add your traceability information in there as properties. Alternatively (or in addition to this), you can surface the audit trail information on the page itself when generating the cache page.
Materialising information on the page itself will however require a change of tack around embedding the cache page, as the audit information will now appear as part of the full-page embed. Consider placing the actual query content under a heading (e.g. `# Content`) and then embedding only that section (e.g. `![[page_cache#Content]]`), which leaves you free to add any other information into the cache page.
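One way this could look, as a sketch only: the helper below wraps the query output in a frontmatter block, a visible audit line, and a `# Content` heading. The helper name `buildCachePage` and the property names `cache_source` and `cache_generated` are hypothetical.

```js
// Hypothetical helper: wrap query output with audit properties and a heading.
// `generatedAt` can be passed in explicitly so the output is reproducible.
function buildCachePage(markdown, sourcePath, generatedAt = new Date()) {
    const frontmatter = [
        "---",
        `cache_source: "${sourcePath}"`,
        `cache_generated: ${generatedAt.toISOString()}`,
        "---",
    ].join("\n");
    // Visible audit line, using only the date part of the timestamp
    const audit = `Generated from \`${sourcePath}\` on ${generatedAt.toISOString().slice(0, 10)}.`;
    // The query output lives under its own heading so it can be embedded alone
    return `${frontmatter}\n${audit}\n\n# Content\n${markdown.trim()}\n`;
}
```

In the simplified script, the string returned here would be written to the cache page in place of the raw `cacheResult`, and the main page would embed only the `Content` section.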
### Backlink interception
Live Dataview results are not 'real' links to their result pages, i.e. they WON'T cause a backlink to be created from those pages, and they will not 'web up' your MOC on the graph view.
Dataview results on cache pages WILL however be hard-linked to their result pages. This may or may not be a desirable consequence of materialising your queries.
Desirable or not, the backlink will take users to this cache page and not the main page itself, i.e. it will link to the table on `page_cache.md` as opposed to the `page.md` that features the subject and context.
To help ameliorate this, consider adding a `cache_backlink: "[[page.md]]"` property to your suite of script properties, then adding a through-link to the main page, on the generated cache page, to help users "close the loop". You can see an example of this on [[State (tpdv output)]]. It's not super clean but it fulfilled the functionality for the experiment.
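A small sketch of the through-link step, assuming the `cache_backlink` value is available as a plain string of link text (Dataview may hand back a link object instead, in which case it would need converting first). The helper name `withThroughLink` and the "Main page:" wording are made up for illustration.

```js
// Hypothetical helper: prepend a through-link to the main page onto cache content.
// `backlink` is link text such as "[[page]]"; pages without one are left unchanged.
function withThroughLink(cacheContent, backlink) {
    if (!backlink) return cacheContent;
    return `Main page: ${backlink}\n\n${cacheContent}`;
}
```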
### On-page replacement
Instead of re-generating a separate cache page and then embedding it back onto the page, you can instead opt to target and replace a partial / placeholder on the page itself.
Benefits:
- There is only ever one page in the workflow (no page bloat & keeps content together)
- This avoids the previously-mentioned backlink interception occurring
Disadvantages:
- The script and page are inseparably coupled now, and are mixed with content, so are less reusable
- Continuously updates the main page, destroying modified date traceability (this is why I do not do this one)
- Misplaced anchors or scripting mistakes may overwrite unintended parts of your file
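If you do want to try on-page replacement, a sketch along these lines may help with the last risk above. It assumes a pair of hypothetical placeholder markers written as Obsidian comments, and it refuses to touch the page when either marker is missing.

```js
// Hypothetical markers, written as Obsidian comments
const START_MARKER = "%% dv-cache:start %%";
const END_MARKER = "%% dv-cache:end %%";

// Replace whatever sits between the markers, leaving the rest of the page untouched
function replaceBetweenMarkers(content, replacement) {
    const start = content.indexOf(START_MARKER);
    const end = content.indexOf(END_MARKER, start + START_MARKER.length);
    // Refuse to write if either marker is missing, rather than guess where to put the result
    if (start === -1 || end === -1) {
        throw new Error("Cache markers not found; page left unchanged");
    }
    const before = content.slice(0, start + START_MARKER.length);
    const after = content.slice(end);
    return `${before}\n${replacement.trim()}\n${after}`;
}
```

The returned string could then be written back with `app.vault.modify()` as in the simplified script. Because the markers are kept in place, re-running the replacement with the same result leaves the page unchanged.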
## Notes & acknowledgements
I initially developed this when I started playing around with Obsidian and [[Pub Wrap|HTML/CSS/JS]] at the end of 2023. But apart from [a post on Reddit](https://www.reddit.com/r/ObsidianMD/comments/1al10ep/sharing_some_styling_and_publish_solutions/) I was too lazy and distracted to do anything else about it.
Someone interested in picking up Obsidian recently asked me about it though, and subsequently told me it was too confusing. Despite my descent back to confusion in the last 80% of this article, it was honestly an attempt at simplifying the concept to draw more people to Obsidian and make the solutions it enables more accessible and scalable.
Some plugs and acknowledgements:
- [Obsidian](https://obsidian.md/) itself of course
- [Dataview](https://github.com/blacksmithgu/obsidian-dataview) community plugin by [Michael Brenan](https://github.com/blacksmithgu) and [all these contributors](https://github.com/blacksmithgu/obsidian-dataview/graphs/contributors)
- [Templater](https://github.com/SilentVoid13/Templater) community plugin by [SilentVoid](https://github.com/SilentVoid13) and [all these contributors](https://github.com/SilentVoid13/Templater/graphs/contributors)
- [Joschua](https://joschua.io/) did an extremely similar thing before me [here](https://joschua.io/posts/2023/09/01/obsidian-publish-dataview/) (and also waffles less than me)