Content collections are the best way to manage content in any Astro project. Using the Content Layer API, we can define a collection that gives us automatic TypeScript type safety for all of our content.
Let’s say we want to build our personal blog using Astro; with a collection, we can use a bunch of Markdown files (or MDX, Markdoc, YAML, or JSON files) stored locally in our project as the source of our blog content.
NOTE
This API allows us to load local content (aka files in our filesystem), but there are also third-party loaders (or we can create our own custom loader) to fetch content from remote sources (think headless CMS).
To define a collection, we must create a src/content.config.ts file in our project, which Astro will use to configure our content collections (we can define as many as we need).
This file used to live in src/content/config.ts (the old location; the Astro API moves fast), but it now belongs in src/content.config.ts.
The glob function allows us to do two things:
Set the folder for our collection, using base (relative to project root).
Exclude some folders or files from being parsed, using the pattern option (!_**/*.md won’t match folders starting with an underscore).
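To make those two options concrete, here is a minimal sketch of what the config file could look like, written against the Astro 5 Content Layer API. The collection name blog, the ./src/content/blog folder, and the schema fields are assumptions made for illustration, and pairing a catch-all pattern with the negated one is just one plausible way to write the underscore exclusion.

```ts
// src/content.config.ts
import { defineCollection, z } from "astro:content";
import { glob } from "astro/loaders";

// "blog" is an assumed collection name; use whatever fits your project.
const blog = defineCollection({
  loader: glob({
    // Folder that holds the collection's files, relative to the project root (assumed path).
    base: "./src/content/blog",
    // Include every Markdown file, then drop whatever the negated pattern matches.
    pattern: ["**/*.md", "!_**/*.md"],
  }),
  // Assumed frontmatter fields; the schema is what drives the generated types.
  schema: z.object({
    title: z.string(),
    pubDate: z.coerce.date(),
  }),
});

export const collections = { blog };
```

With something like this in place, entries in the collection are typed according to the schema, so a missing or misspelled frontmatter field shows up as a build error instead of a broken page.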
NOTE
We are using a TypeScript file (*.ts) to configure our collection, but it’s also possible to use JavaScript (with the .js extension), or even a Michael Jackson file (.mjs).
Pagefind is a search library for static websites. It works by indexing the built static files of our site, and adding a JavaScript search API that allows the user to find content with code that runs client-side.
The official Pagefind site contains excellent and comprehensive instructions for installing Pagefind.
Since we are in Node.js, we can take advantage of the wrapper package for this runtime, and run:
npx pagefind --site "public"
Where public is whatever folder your SSG writes the generated site to. But I’m gonna go ahead and add it to my project dependencies so that I can integrate it into my package.json scripts.
So far we need several commands to get Pagefind to generate an index of the latest version of our site:
Build the site itself: npm run build
Generate the Pagefind index: pagefind --site dist
It’s probably a good idea to combine these two commands, so that whatever pipeline runs npm run build when we deploy our site also generates the Pagefind index. So let’s update the build script in our package.json file:
"scripts": {
  "build": "astro build && pagefind --site dist"
}
Running npm run build now produces the usual output, plus the following:
[...]
dist/pagefind
[Building search indexes]
Total:
Indexed 1 language
Indexed 45 pages
Indexed 1285 words
Indexed 0 filters
Indexed 0 sorts
If we open the dist folder (where Astro outputs the build), we should see a pagefind folder with our indexed site.
CAUTION
In my case, that didn’t work due to the way my GitHub workflow was written:
As you can see, it runs the astro command directly, and not our npm run build. So if you don’t want to mess with your pipeline (no biggie, but hey), you can add the following to your build script:
That way, after every build, we end up with the pagefind folder in our public directory, and our GitHub pages deployment won’t have trouble finding it.
Bottom line, you can do it your way, but the whole setup needs the pagefind folder to be available in the public folder.
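If the goal is simply to end up with the index under public, one possible way to get there is a small Node script that copies dist/pagefind into public/pagefind after the build. This is only a sketch under assumptions: the function name copyPagefindIndex and the scripts/copy-pagefind.ts location are hypothetical, and it presumes Pagefind has already written its index to dist/pagefind.

```ts
// scripts/copy-pagefind.ts (hypothetical location)
import { cpSync, existsSync, mkdirSync, rmSync } from "node:fs";
import { join } from "node:path";

/**
 * Copies the Pagefind index from the build output into the public folder.
 * Returns the destination path so callers can log or verify it.
 */
export function copyPagefindIndex(distDir = "dist", publicDir = "public"): string {
  const source = join(distDir, "pagefind");
  const target = join(publicDir, "pagefind");

  // Fail loudly if the index has not been generated yet.
  if (!existsSync(source)) {
    throw new Error(`No Pagefind index found at ${source}; run the build first.`);
  }

  // Make sure the public folder exists before copying into it.
  mkdirSync(publicDir, { recursive: true });
  // Remove any stale copy so files for deleted pages don't linger.
  rmSync(target, { recursive: true, force: true });
  cpSync(source, target, { recursive: true });
  return target;
}
```

Calling copyPagefindIndex() with no arguments uses the dist and public defaults. It could be chained after pagefind --site dist in the build script with whatever TypeScript runner the project already uses.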
IMPORTANT
Don’t forget to remove pagefind from your .gitignore file.
Assuming that every time we build our site, Pagefind indexes its content, we need a UI to search and see the results. Pagefind now recommends the Component UI, which gives us declarative web components for common search interfaces.
NOTE
The previous UI, based on pagefind-ui.js and new PagefindUI(...), still exists in the docs, but the Component UI is the recommended path now.
The nice thing about the new system is that the components follow WAI-ARIA best practices, handle visible and assistive text, and are designed to be hard to misuse.
For images in search results, I provide explicit Pagefind metadata in the page head:
<meta
data-pagefind-meta="image[content]"
content={post.data.coverImage.cropped.src}
/>
<meta
data-pagefind-meta="image_alt[content]"
content={post.data.coverImage.alt}
/>
One thing to keep in mind is that Pagefind indexes the built site. If images look broken while running the dev server, test with the production build instead:
My search results showed the site title instead of the post title. The reason was subtle: Pagefind’s automatic title metadata comes from the first h1 on the page.
My logo used to be an h1:
<h1 class="anaglyph">生活のバランス</h1>
So Pagefind treated the logo as the page title. The fix was to make the logo a non-heading element:
<span class="anaglyph">生活のバランス</span>
Then I restored the old visual size with CSS:
.anaglyph {
  display: inline-block;
  font-size: 2em;
  font-weight: 900;
  line-height: 1.2;
}
Another good option is to explicitly mark the real page title for Pagefind:
One issue I ran into was related to Astro’s client-side navigation. The modal worked the first time, but after clicking a search result and navigating to another page, Cmd+K started behaving erratically.
The browser complained with something like:
Failed to execute 'showModal' on 'HTMLDialogElement':
The element is not in a Document.
The problem was that Astro swapped the page DOM, but Pagefind still had references to the old modal internally. So the next time I opened search, Pagefind sometimes tried to open a modal that was no longer in the document.
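The fix was to clean up Pagefind before Astro swaps pages:
<script>
  // Bind the cleanup only once, even if this script runs again after a navigation.
  if (!document.documentElement.dataset.pagefindCleanupBound) {
    document.documentElement.dataset.pagefindCleanupBound = "true"
    // astro:before-swap fires right before Astro replaces the page DOM.
    document.addEventListener("astro:before-swap", () => {
      // NOTE: the lines that close the open modal and obtain `manager`
      // (Pagefind's component instance manager) are missing from this listing.
      if (!manager?.getInstanceNames || !manager?.removeInstance) return
      // Drop every cached component instance so the next page starts fresh.
      for (const name of manager.getInstanceNames()) {
        manager.removeInstance(name)
      }
    })
  }
</script>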
That closes any open Pagefind modal and clears Pagefind’s cached component instances before Astro swaps the page. The next page then creates fresh modal and trigger references.
Pagefind should scan the built HTML, find the data-pagefind-body elements, and generate the index. After that, the search trigger should open the modal, Cmd+K / Ctrl+K should work, and the results should show the correct post titles.