Fix Haskell project setup and implement Mermaid processing

- Update docster.cabal to modern format with GHC 9.12.2 compatibility
- Fix Mermaid code block detection using getDefaultExtensions
- Switch from SVG to PNG output for PDF compatibility
- Add CLAUDE.md with development instructions
- Update tooling from mise to ghcup for Haskell management

Known issues:
- Generated diagram files are created in root directory instead of alongside source files
- PDF generation fails with LaTeX errors for complex documents (missing \tightlist support)
- HTML output lacks proper DOCTYPE (quirks mode)
- Debug output still present in code
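
A possible direction for the first known issue, sketched under assumptions: derive the diagram paths from the directory of the Markdown source instead of the working directory. `diagramPaths` is a hypothetical helper and is not part of this commit; `processMermaid` would also need the source path passed in.

```haskell
import System.FilePath (takeDirectory, (</>), (<.>))

-- Hypothetical helper, not part of this commit: given the Markdown source
-- path and a diagram base name, return the .mmd and .png paths located
-- in the same directory as the source file.
diagramPaths :: FilePath -> String -> (FilePath, FilePath)
diagramPaths sourcePath baseName =
  ( dir </> baseName <.> "mmd"
  , dir </> baseName <.> "png" )
  where
    dir = takeDirectory sourcePath

main :: IO ()
main = print (diagramPaths "docs/guide.md" "diagram-123456")
```

Keeping the path logic pure makes it easy to check without running mmdc.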

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
Willem van den Ende 2025-07-29 17:23:04 +02:00
parent 68507e85d1
commit e248395557
6 changed files with 173 additions and 27 deletions

CLAUDE.md Normal file

@@ -0,0 +1,78 @@
# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Project Overview
Docster is a Haskell CLI tool that converts Markdown files with embedded Mermaid diagrams into PDF or HTML. It processes Mermaid code blocks by rendering them to PNG images with mermaid-cli and embedding them in the output.
## Development Commands
### Build
```bash
cabal build
```
### Run
```bash
# Convert to PDF
cabal run docster -- -pdf path/to/file.md
# Convert to HTML
cabal run docster -- -html path/to/file.md
```
### Test a single file
```bash
cabal run docster -- -pdf mermaid-to-svg/sample.md
```
### Clean build artifacts
```bash
cabal clean
```
### Interactive development
```bash
cabal repl
```
## Architecture
The application consists of:
1. **Main.hs** (app/Main.hs:1-69) - Entry point and core logic
   - `processMermaid` (app/Main.hs:21-34) - Transforms Mermaid code blocks into PNG images
   - `transformDoc` (app/Main.hs:36-37) - Walks the Pandoc AST to process all blocks
   - `compileToPDF` (app/Main.hs:47-58) - Converts Markdown to PDF using LaTeX
   - `compileToHTML` (app/Main.hs:60-68) - Converts Markdown to HTML
The tool uses Pandoc's AST transformation capabilities to:
1. Parse Markdown input
2. Find Mermaid code blocks
3. Generate PNG files using mermaid-cli (mmdc)
4. Replace code blocks with image references
5. Output final PDF/HTML
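The find-and-replace steps above can be sketched as a pure transformation. This is an illustration under assumptions, not the code in app/Main.hs: `replaceMermaid` is a hypothetical name, the image path is a fixed placeholder, and writing the `.mmd` file and calling mmdc are left out. It relies on the `pandoc-types` package.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Text.Pandoc.Definition
import Text.Pandoc.Walk (walk)

-- Pure stand-in for steps 2 and 4: find Mermaid code blocks and swap each
-- for an image reference. The real tool also writes a .mmd file and runs
-- mmdc (step 3); here the image path is a fixed, hypothetical placeholder.
replaceMermaid :: Block -> Block
replaceMermaid (CodeBlock (_, classes, _) _)
  | "mermaid" `elem` classes =
      Para [Image nullAttr [] ("diagram.png", "Mermaid diagram")]
replaceMermaid other = other

main :: IO ()
main = do
  let doc = Pandoc nullMeta
        [ CodeBlock ("", ["mermaid"], []) "graph TD; A-->B"
        , CodeBlock ("", ["haskell"], []) "main = pure ()"
        ]
  -- Only the first block is replaced; the second is left untouched.
  print (walk replaceMermaid doc)
```

A pure variant like this can be checked without mmdc installed; the IO version in the tool uses `walkM` for the same traversal.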
## Dependencies
External requirements:
- GHC 9.12.2 and Cabal 3.16 (install via ghcup)
- Pandoc library
- TeX Live (for PDF generation via pdflatex)
- Mermaid CLI (`npm install -g @mermaid-js/mermaid-cli`)
To install GHC and Cabal:
```bash
source ~/.ghcup/env && ghcup install ghc 9.12.2
source ~/.ghcup/env && ghcup install cabal 3.16.0.0
source ~/.ghcup/env && ghcup set ghc 9.12.2
```
## Common Issues
1. **Missing LaTeX packages**: PDF generation requires a LaTeX distribution. Install BasicTeX or TinyTeX and use tlmgr to add packages as needed.
2. **Mermaid CLI not found**: Ensure mmdc is in PATH after installing @mermaid-js/mermaid-cli globally.
3. **Type errors with Text vs String**: The codebase mixes Data.Text and String. Use T.pack/T.unpack for conversions.
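A minimal sketch of keeping the Text/String boundary explicit, in the spirit of item 3. `baseNameFor` is a hypothetical helper, not a function in app/Main.hs; it assumes block identifiers arrive as `Text` while file paths stay `String`.

```haskell
import qualified Data.Text as T

-- Hypothetical helper: choose a file base name for a diagram.
-- The block identifier is Text; FilePath is a String, so the
-- conversion happens once, here, via T.unpack.
baseNameFor :: T.Text -> String -> FilePath
baseNameFor blockId fallback
  | T.null blockId = fallback
  | otherwise      = T.unpack blockId

main :: IO ()
main = do
  putStrLn (baseNameFor (T.pack "flow-chart") "diagram-1")
  putStrLn (baseNameFor T.empty "diagram-1")
```

Converting at a single boundary keeps the `T.pack`/`T.unpack` calls from spreading through the rest of the code.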


@@ -15,3 +15,9 @@ Mermaid code blocks (```mermaid) will be rendered to SVG and embedded.
 - Pandoc
 - TeX Live (for PDF)
 - Mermaid CLI (`npm install -g @mermaid-js/mermaid-cli`)
+### specific versions
+source ~/.ghcup/env && ghcup install ghc 9.12.2
+source ~/.ghcup/env && ghcup install cabal 3.16.0.0
+source ~/.ghcup/env && ghcup install hls 2.11.0.0


@@ -6,33 +6,42 @@ import Text.Pandoc
 import Text.Pandoc.Error
 import Text.Pandoc.Class (runIOorExplode)
 import Text.Pandoc.PDF (makePDF)
+import Text.Pandoc.Walk (walkM)
+import Text.Pandoc.Extensions (Extension(..), enableExtension, getDefaultExtensions)
 import System.Environment (getArgs)
 import System.FilePath (replaceExtension)
 import System.Process (callProcess)
-import System.IO (writeFile)
 import System.Directory (doesFileExist)
 import Data.Text (Text)
 import qualified Data.Text as T
 import qualified Data.Text.IO as TIO
 import Data.Hashable (hash)
 import Control.Monad (when, void)
+import qualified Data.ByteString.Lazy as BL
 -- Transform Mermaid code blocks into image embeds
 processMermaid :: Block -> IO Block
-processMermaid (CodeBlock (id', classes, _) contents)
+processMermaid block@(CodeBlock (id', classes, _) contents)
   | "mermaid" `elem` classes = do
-      let baseName = if null id' then "diagram-" ++ take 6 (show (abs (hash contents))) else id'
+      putStrLn $ "🎯 Found Mermaid block with classes: " ++ show classes
+      let baseName = if T.null id' then "diagram-" ++ take 6 (show (abs (hash (T.unpack contents)))) else T.unpack id'
           mmdFile = baseName ++ ".mmd"
-          svgFile = baseName ++ ".svg"
+          pngFile = baseName ++ ".png"
-      writeFile mmdFile contents
-      exists <- doesFileExist svgFile
-      when (not exists) $ void $ callProcess "mmdc" ["-i", mmdFile, "-o", svgFile]
+      putStrLn $ "📝 Writing to " ++ mmdFile ++ " and generating " ++ pngFile
+      writeFile mmdFile (T.unpack contents)
+      void $ callProcess "mmdc" ["-i", mmdFile, "-o", pngFile]
+      putStrLn $ "✅ Generated " ++ pngFile
-      return $ Para [Image nullAttr [] (T.pack svgFile, "Mermaid diagram")]
+      return $ Para [Image nullAttr [] (T.pack pngFile, "Mermaid diagram")]
-processMermaid x = return x
+processMermaid x = do
+  -- Debug: show what blocks we're processing
+  case x of
+    CodeBlock (_, classes, _) _ -> putStrLn $ "📄 Found code block with classes: " ++ show classes
+    _ -> return ()
+  return x
--- Walk the Pandoc AST and process blocks
+-- Walk the Pandoc AST and process blocks using walkM
 transformDoc :: Pandoc -> IO Pandoc
 transformDoc = walkM processMermaid
@@ -44,22 +53,51 @@ main = do
     ["-html", path] -> compileToHTML path
     _ -> putStrLn "Usage: docster -pdf|-html <file.md>"
+pdfTemplate :: T.Text
+pdfTemplate = T.unlines [
+  "\\documentclass{article}",
+  "\\usepackage[utf8]{inputenc}",
+  "\\usepackage{graphicx}",
+  "\\usepackage{geometry}",
+  "\\geometry{margin=1in}",
+  "\\usepackage{hyperref}",
+  "\\usepackage{enumitem}",
+  "\\providecommand{\\tightlist}{%",
+  " \\setlength{\\itemsep}{0pt}\\setlength{\\parskip}{0pt}}",
+  "\\title{$title$}",
+  "\\author{$author$}",
+  "\\date{$date$}",
+  "\\begin{document}",
+  "$if(title)$\\maketitle$endif$",
+  "$body$",
+  "\\end{document}"
+  ]
 compileToPDF :: FilePath -> IO ()
 compileToPDF path = do
   content <- TIO.readFile path
-  pandoc <- runIOorExplode $ readMarkdown def content
+  let readerOptions = def { readerExtensions = getDefaultExtensions "markdown" }
+  pandoc <- runIOorExplode $ readMarkdown readerOptions content
   transformed <- transformDoc pandoc
   let outputPath = replaceExtension path "pdf"
-  result <- runIOorExplode $ makePDF "pdflatex" [] transformed
+      writerOptions = def
+  -- First generate LaTeX with proper document structure
+  latexOutput <- runIOorExplode $ writeLaTeX writerOptions transformed
+  let latexWithHeader = "\\documentclass{article}\n\\usepackage[utf8]{inputenc}\n\\usepackage{graphicx}\n\\usepackage{geometry}\n\\geometry{margin=1in}\n\\begin{document}\n" <> latexOutput <> "\n\\end{document}"
+  result <- runIOorExplode $ makePDF "pdflatex" [] (\_ _ -> return latexWithHeader) def transformed
   case result of
     Left err -> error $ "PDF error: " ++ show err
-    Right bs -> writeFile outputPath bs >> putStrLn ("✅ PDF written to " ++ outputPath)
+    Right bs -> BL.writeFile outputPath bs >> putStrLn ("✅ PDF written to " ++ outputPath)
 compileToHTML :: FilePath -> IO ()
 compileToHTML path = do
   content <- TIO.readFile path
-  pandoc <- runIOorExplode $ readMarkdown def content
+  putStrLn $ "📖 Reading: " ++ path
+  putStrLn $ "📝 Content: " ++ T.unpack content
+  let readerOptions = def { readerExtensions = getDefaultExtensions "markdown" }
+  pandoc <- runIOorExplode $ readMarkdown readerOptions content
+  putStrLn $ "🔍 Parsed AST: " ++ show pandoc
   transformed <- transformDoc pandoc
   let outputPath = replaceExtension path "html"


@@ -1,19 +1,45 @@
-cabal-version: >=1.24
+cabal-version: 3.0
 name: docster
 version: 0.1.0.0
+synopsis: A self-contained CLI tool that converts Markdown with Mermaid diagrams to PDF/HTML
+description: Docster converts Markdown documents containing Mermaid diagrams into PDF or HTML files
+             using Pandoc and Mermaid CLI. It automatically renders Mermaid code blocks to SVG
+             and embeds them in the output.
+homepage: https://github.com/yourusername/docster
+license: BSD-3-Clause
+license-file: LICENSE
+author: Your Name
+maintainer: your.email@example.com
+category: Text
 build-type: Simple
+extra-doc-files: README.md
+common warnings
+    ghc-options: -Wall
+                 -Wcompat
+                 -Widentities
+                 -Wincomplete-record-updates
+                 -Wincomplete-uni-patterns
+                 -Wmissing-export-lists
+                 -Wmissing-home-modules
+                 -Wpartial-fields
+                 -Wredundant-constraints
 executable docster
+    import: warnings
     main-is: Main.hs
     hs-source-dirs: app
     build-depends:
-        base >=4.14 && <5,
-        text,
-        filepath,
-        directory,
-        process,
-        hashable,
-        pandoc,
-        pandoc-types,
-        pandoc-pdf
+        base >=4.21 && <5,
+        text >=2.0 && <2.2,
+        filepath >=1.4 && <1.6,
+        directory >=1.3 && <1.4,
+        process >=1.6 && <1.7,
+        hashable >=1.4 && <1.6,
+        pandoc >=3.0 && <3.2,
+        pandoc-types >=1.23 && <1.25,
+        bytestring >=0.11 && <0.13
     default-language: Haskell2010
+    ghc-options: -threaded
+                 -rtsopts
+                 -with-rtsopts=-N


@@ -1,6 +1,4 @@
 [tools]
-cabal = "3.16.0.0"
-# Node.js - required for Claude Code (18+) and frontend assets
+# Node.js - required for mermaid-cli
 node = "24.0.1"