From 7518a035b189e3ff3d27e8d4ea22de1575ca19e5 Mon Sep 17 00:00:00 2001
From: Edward Loveall <edward@edwardloveall.com>
Date: Sun, 23 Jan 2022 15:05:46 -0500
Subject: [PATCH] Proxy GitHub gists with rate limiting

Previously, GitHub gists were embedded. The gist url would be detected
in a paragraph and the page would render a script like:

```html
<script src="https://gist.github.com/user/gist_id.js"></script>
```

The script would then embed the gist on the page. However, gists contain
multiple files. It's technically possible to embed a single file in the
same way by appending a `file` query param:

```html
<script
src="https://gist.github.com/user/gist_id.js?file=foo.txt"></script>
```

I wanted to try and tackle proxying gists instead.

Overview
--------

At a high level, the `PageConverter` kicks off the work of fetching and
storing the gist content, then sends that content down to the
`ParagraphConverter`. When a paragraph comes up that contains a gist
embed, it retrieves the previously fetched content. This allows all the
necessary content to be fetched up front so the minimum number of
requests need to be made.

Fetching Gists
--------------

There is now a `GithubClient` class that gets gist content from GitHub's
REST API. The gist API response looks something like this (irrelevant
keys removed):

```json
{
  "files": {
    "file-one.txt": {
      "filename": "file-one.txt",
      "raw_url": "https://gist.githubusercontent.com/<username>/<id>/raw/<file_id>/file-one.txt",
      "content": "..."
    },
    "file-two.txt": {
      "filename": "file-two.txt",
      "raw_url": "https://gist.githubusercontent.com/<username>/<id>/raw/<file_id>/file-two.txt",
      "content": "..."
    }
  }
}
```

That response gets turned into a bunch of `GistFile` objects that are
then stored in a request-level `GistStore`. Crystal's JSON parsing does
not make it easy to parse JSON with arbitrary keys into objects. This is
because each key corresponds to an object property, like
`property name : String`. If Crystal doesn't know the keys ahead of
time, there's no way to know what methods to create.

That's a problem here because the key for each gist file is the unique
filename. Fortunately, the keys for each _file_ follow the same pattern
and are easy to parse into a `GistFile` object. To turn gist file JSON
into Crystal objects, the `GithubClient` turns the whole response into a
`JSON::Any` which is like a Hash. Then it extracts just the file data
objects and parses those into `GistFile` objects.

Those `GistFile` objects are then cached in a `GistStore` that is shared
for the page, which means one gist cache per request/article. `GistFile`
objects can be fetched out of the store by file, or if no file is
specified, it returns all files in the gist.
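
As a rough illustration of that lookup, here is a simplified sketch.
`SketchFile` and `lookup` are hypothetical names, and unlike the real
store this version returns an empty list for unknown gists:

```crystal
# Hypothetical, simplified stand-in for GistFile.
record SketchFile, filename : String, content : String

# One store per page: gist id => files fetched for that gist.
store = Hash(String, Array(SketchFile)).new
store["1"] = [
  SketchFile.new("one.txt", "first"),
  SketchFile.new("two.txt", "second"),
]

# Returns the matching file when a filename is given, otherwise every
# file in the gist. Unknown ids yield an empty list in this sketch.
def lookup(store, id : String, filename : String?)
  files = store[id]? || [] of SketchFile
  return files if filename.nil?
  files.select { |file| file.filename == filename }
end

puts lookup(store, "1", "one.txt").map(&.filename)
puts lookup(store, "1", nil).map(&.filename)
```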

A `GistFile` is rendered as a link, labeled with the file's name, to the
file in the gist on GitHub, followed by a code block of the file's
contents.

In summary, the `PageConverter`:

* Scans the paragraphs for GitHub gists using `GistScanner`
* Requests their data from GitHub using the `GithubClient`
* Parses the response into `GistFile`s and populates the `GistStore`
* Passes that `GistStore` to the `ParagraphConverter` to use when
  constructing the page nodes
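
The fetch-up-front idea can be sketched with a fake fetcher that counts
requests. Everything here (`prefetch`, `fake_fetch`, the `Files` alias,
the URLs) is made up for illustration and is not the code in this
commit:

```crystal
alias Files = Array(String)

# Fetch every distinct gist once, before any paragraph is converted.
def prefetch(urls : Array(String), fetch : String -> Files)
  store = Hash(String, Files).new
  urls.uniq.each do |url|
    id = url.split("/").last
    store[id] = fetch.call(id)
  end
  store
end

requests = 0
# Hypothetical fetcher standing in for a real API call.
fake_fetch = ->(id : String) do
  requests += 1
  ["#{id}/file-one.txt", "#{id}/file-two.txt"]
end

urls = [
  "https://gist.github.com/user/abc123",
  "https://gist.github.com/user/abc123",
  "https://gist.github.com/user/def456",
]
store = prefetch(urls, fake_fetch)

# Converting paragraphs later only reads from the store.
puts store["abc123"]
puts "requests made: #{requests}"
```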

Caching
-------

GitHub limits API requests to 5000/hour with a valid API token and
60/hour without. 60 is pretty tight for the usage that scribe.rip gets,
but 5000 is reasonable most of the time. Not every article has an
embedded gist, but some articles have multiple gists. A viral article
(of which Scribe has seen two at the time of this commit) might receive
a little over 127k hits/day, which is an average of over 5300/hour. If
that article had a gist, Scribe would reach the API limit during parts
of the day with high traffic. If it had multiple gists, it would hit it
even more. However, average traffic is around 30k visits/day which would
be well under the limit, assuming average load.
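
For anyone who wants to redo that back-of-the-envelope math, here is a
small sketch. The inputs are the rounded figures from the paragraph
above, and `hourly_requests` is a hypothetical helper that assumes
traffic is spread evenly over the day:

```crystal
AUTHENTICATED_LIMIT = 5000 # requests/hour with a token
HOURS_PER_DAY       =   24

# Average GitHub requests per hour if every page view of an article
# triggers one request per embedded gist.
def hourly_requests(hits_per_day : Int32, gists : Int32 = 1) : Float64
  hits_per_day * gists / HOURS_PER_DAY
end

viral = hourly_requests(127_000)
typical = hourly_requests(30_000)

puts "viral: #{viral.round(1)}/hour"
puts "typical: #{typical.round(1)}/hour"
puts "viral exceeds limit: #{viral > AUTHENTICATED_LIMIT}"
```

Real traffic is bursty, so peak hours would sit above these averages.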

To help not hit that limit, a `GistStore` holds all the `GistFile`
objects per gist. The logic in `GistScanner` is smart enough to only
return unique gist URLs so each gist is only requested once even if
multiple files from one gist exist in an article. This limits the number
of times Scribe hits the GitHub API.
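
The de-duplication boils down to stripping the `file` query param and
calling `uniq`. A standalone sketch, with made-up hrefs:

```crystal
require "uri"

# Strip the query string so every file of a gist maps to the same URL.
def url_without_params(url : String) : String
  uri = URI.parse(url)
  uri.query = nil
  uri.to_s
end

hrefs = [
  "https://gist.github.com/user/123ABC?file=example.txt",
  "https://gist.github.com/user/123ABC?file=other.txt",
  "https://example.com/not-a-gist",
]

gist_urls = hrefs
  .select(&.starts_with?("https://gist.github.com"))
  .map { |href| url_without_params(href) }
  .uniq

puts gist_urls
```

Two embeds of different files from the same gist collapse into a single
URL, so only one API request is needed for that gist.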

If Scribe is rate-limited, instead of populating a `GistStore` the
`PageConverter` will create a `RateLimitedGistStore`. This is an object
that acts like the `GistStore` but returns `RateLimitedGistFile` objects
instead of `GistFile` objects. This allows Scribe to gracefully degrade
in the event of reaching the rate limit.
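
The pattern is two objects that answer the same method, so callers never
branch on rate limiting. A minimal sketch with hypothetical class and
method names that only loosely follow the real ones:

```crystal
# Hypothetical stand-ins; the names only loosely follow this commit.
class SketchStore
  def initialize(@files : Hash(String, String))
  end

  def content_for(id : String) : String
    @files[id]? || "Gist file missing."
  end
end

class SketchRateLimitedStore
  # Same method, but it never touches fetched data.
  def content_for(id : String) : String
    "Can't fetch gist. GitHub rate limit reached."
  end
end

# Callers accept either store and render whatever comes back.
def render(store : SketchStore | SketchRateLimitedStore, id : String)
  "<pre><code>#{store.content_for(id)}</code></pre>"
end

puts render(SketchStore.new({"1" => "puts 1"}), "1")
puts render(SketchRateLimitedStore.new, "1")
```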

If rate-limiting becomes a regular problem, Scribe could also be
reworked to fall back to the embedded gists again.

API Credentials
---------------

API credentials are in the form of a GitHub username and a personal
access token attached to that username. To get a token, visit
https://github.com/settings/tokens and create a new token. The only
permission it needs is `gist`.

This token is set via the `GITHUB_PERSONAL_ACCESS_TOKEN` environment
variable. The username also needs to be set via `GITHUB_USERNAME`. When
developing locally, these can both be set in the .env file.
Authentication is probably not necessary locally, but it's there if you
want to test. If either variable is missing, unauthenticated requests
are made.
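
In rough terms, the optional authentication looks like the sketch below.
`github_client` is a hypothetical helper, not the actual method in
`GithubClient`, and no request is made until `get` is called:

```crystal
require "http/client"

# Builds a client for api.github.com. Basic auth is only configured
# when both values are present; otherwise requests stay anonymous.
def github_client(username : String?, token : String?) : HTTP::Client
  client = HTTP::Client.new("api.github.com", tls: true)
  if username && token
    client.basic_auth(username, token)
  end
  client
end

client = github_client(
  ENV["GITHUB_USERNAME"]?,
  ENV["GITHUB_PERSONAL_ACCESS_TOKEN"]?
)
# client.get("/gists/<id>") would then perform the request.
```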

Rendering
---------

The node tree itself holds a `GithubGist` object. It has a reference to
the `GistStore` and the original gist URL. When it renders, the page
requests the gist's `files`. The gist ID and optional file are detected,
and then used to request the file(s) from the `GistStore`. Gists render
as a list of each file's contents and a link to the file on GitHub.

If the requests were rate limited, the store is a
`RateLimitedGistStore` and the files are `RateLimitedGistFile`s. These
rate-limited objects are rendered with a link to the gist on GitHub and
text saying that Scribe has been rate-limited.

If somehow the file requested doesn't exist in the store, it displays
similarly to the rate-limited file but with "file missing" text instead
of "rate limited" text.

GitHub API docs: https://docs.github.com/en/rest/reference/gists
Rate Limiting docs:
https://docs.github.com/en/rest/overview/resources-in-the-rest-api#rate-limiting
---
 spec/classes/embedded_converter_spec.cr      |  14 ++-
 spec/classes/gist_scanner_spec.cr            | 102 +++++++++++++++++++
 spec/classes/gist_store_spec.cr              |  58 +++++++++++
 spec/classes/paragraph_converter_spec.cr     |  18 ++--
 spec/classes/rate_limited_gist_store_spec.cr |  23 +++++
 spec/components/page_content_spec.cr         |  18 +++-
 spec/models/gist_file_spec.cr                |  34 +++++++
 spec/models/gist_params_spec.cr              |  33 ++++++
 src/classes/embedded_converter.cr            |  19 ++--
 src/classes/gist_scanner.cr                  |  28 +++++
 src/classes/gist_store.cr                    |  45 ++++++++
 src/classes/page_converter.cr                |  19 +++-
 src/classes/paragraph_converter.cr           |   9 +-
 src/classes/rate_limited_gist_store.cr       |   5 +
 src/clients/github_client.cr                 |  37 +++++++
 src/components/page_content.cr               |  15 ++-
 src/constants.cr                             |   2 +
 src/models/gist_file.cr                      |  89 ++++++++++++++++
 src/models/gist_params.cr                    |  30 ++++++
 src/models/nodes.cr                          |  14 ++-
 src/models/post_response.cr                  |  21 ++++
 src/version.cr                               |   2 +-
 22 files changed, 606 insertions(+), 29 deletions(-)
 create mode 100644 spec/classes/gist_scanner_spec.cr
 create mode 100644 spec/classes/gist_store_spec.cr
 create mode 100644 spec/classes/rate_limited_gist_store_spec.cr
 create mode 100644 spec/models/gist_file_spec.cr
 create mode 100644 spec/models/gist_params_spec.cr
 create mode 100644 src/classes/gist_scanner.cr
 create mode 100644 src/classes/gist_store.cr
 create mode 100644 src/classes/rate_limited_gist_store.cr
 create mode 100644 src/clients/github_client.cr
 create mode 100644 src/models/gist_file.cr
 create mode 100644 src/models/gist_params.cr
diff --git a/spec/classes/embedded_converter_spec.cr b/spec/classes/embedded_converter_spec.cr
index 4b7ec07..780bc92 100644
--- a/spec/classes/embedded_converter_spec.cr
+++ b/spec/classes/embedded_converter_spec.cr
@@ -5,6 +5,7 @@ include Nodes
 describe EmbeddedConverter do
   context "when the mediaResource has an iframeSrc value" do
     it "returns an EmbeddedContent node" do
+      store = GistStore.new
       paragraph = PostResponse::Paragraph.from_json <<-JSON
         {
           "text": "",
@@ -25,7 +26,7 @@ describe EmbeddedConverter do
         }
       JSON
 
-      result = EmbeddedConverter.convert(paragraph)
+      result = EmbeddedConverter.convert(paragraph, store)
 
       result.should eq(
         EmbeddedContent.new(
@@ -40,6 +41,7 @@ describe EmbeddedConverter do
   context "when the mediaResource has a blank iframeSrc value" do
     context "and the href is unknown" do
       it "returns an EmbeddedLink node" do
+        store = GistStore.new
         paragraph = PostResponse::Paragraph.from_json <<-JSON
           {
             "text": "",
@@ -60,7 +62,7 @@ describe EmbeddedConverter do
           }
         JSON
 
-        result = EmbeddedConverter.convert(paragraph)
+        result = EmbeddedConverter.convert(paragraph, store)
 
         result.should eq(EmbeddedLink.new(href: "https://example.com"))
       end
@@ -68,6 +70,7 @@ describe EmbeddedConverter do
 
     context "and the href is gist.github.com" do
       it "returns an GithubGist node" do
+        store = GistStore.new
         paragraph = PostResponse::Paragraph.from_json <<-JSON
           {
             "text": "",
@@ -88,10 +91,13 @@ describe EmbeddedConverter do
           }
         JSON
 
-        result = EmbeddedConverter.convert(paragraph)
+        result = EmbeddedConverter.convert(paragraph, store)
 
         result.should eq(
-          GithubGist.new(href: "https://gist.github.com/user/someid")
+          GithubGist.new(
+            href: "https://gist.github.com/user/someid",
+            gist_store: store
+          )
         )
       end
     end
diff --git a/spec/classes/gist_scanner_spec.cr b/spec/classes/gist_scanner_spec.cr
new file mode 100644
index 0000000..a5648e6
--- /dev/null
+++ b/spec/classes/gist_scanner_spec.cr
@@ -0,0 +1,102 @@
+require "../spec_helper"
+
+describe GistScanner do
+  it "returns gist ids from paragraphs" do
+    iframe = PostResponse::IFrame.new(
+      PostResponse::MediaResource.new(
+        href: "https://gist.github.com/user/123ABC",
+        iframeSrc: "",
+        iframeWidth: 0,
+        iframeHeight: 0
+      )
+    )
+    paragraphs = [
+      PostResponse::Paragraph.new(
+        text: "Check out this gist:",
+        type: PostResponse::ParagraphType::P,
+        markups: [] of PostResponse::Markup,
+        iframe: nil,
+        layout: nil,
+        metadata: nil
+      ),
+      PostResponse::Paragraph.new(
+        text: "",
+        type: PostResponse::ParagraphType::IFRAME,
+        markups: [] of PostResponse::Markup,
+        iframe: iframe,
+        layout: nil,
+        metadata: nil
+      ),
+    ]
+
+    result = GistScanner.new(paragraphs).scan
+
+    result.should eq(["https://gist.github.com/user/123ABC"])
+  end
+
+  it "returns ids without the file parameters" do
+    iframe = PostResponse::IFrame.new(
+      PostResponse::MediaResource.new(
+        href: "https://gist.github.com/user/123ABC?file=example.txt",
+        iframeSrc: "",
+        iframeWidth: 0,
+        iframeHeight: 0
+      )
+    )
+    paragraphs = [
+      PostResponse::Paragraph.new(
+        text: "",
+        type: PostResponse::ParagraphType::IFRAME,
+        markups: [] of PostResponse::Markup,
+        iframe: iframe,
+        layout: nil,
+        metadata: nil
+      ),
+    ]
+
+    result = GistScanner.new(paragraphs).scan
+
+    result.should eq(["https://gist.github.com/user/123ABC"])
+  end
+
+  it "returns a unique list of ids" do
+    iframe1 = PostResponse::IFrame.new(
+      PostResponse::MediaResource.new(
+        href: "https://gist.github.com/user/123ABC?file=example.txt",
+        iframeSrc: "",
+        iframeWidth: 0,
+        iframeHeight: 0
+      )
+    )
+    iframe2 = PostResponse::IFrame.new(
+      PostResponse::MediaResource.new(
+        href: "https://gist.github.com/user/123ABC?file=other.txt",
+        iframeSrc: "",
+        iframeWidth: 0,
+        iframeHeight: 0
+      )
+    )
+    paragraphs = [
+      PostResponse::Paragraph.new(
+        text: "",
+        type: PostResponse::ParagraphType::IFRAME,
+        markups: [] of PostResponse::Markup,
+        iframe: iframe1,
+        layout: nil,
+        metadata: nil
+      ),
+      PostResponse::Paragraph.new(
+        text: "",
+        type: PostResponse::ParagraphType::IFRAME,
+        markups: [] of PostResponse::Markup,
+        iframe: iframe2,
+        layout: nil,
+        metadata: nil
+      ),
+    ]
+
+    result = GistScanner.new(paragraphs).scan
+
+    result.should eq(["https://gist.github.com/user/123ABC"])
+  end
+end
diff --git a/spec/classes/gist_store_spec.cr b/spec/classes/gist_store_spec.cr
new file mode 100644
index 0000000..182d425
--- /dev/null
+++ b/spec/classes/gist_store_spec.cr
@@ -0,0 +1,58 @@
+require "../spec_helper"
+
+describe GistStore do
+  describe "#store_gist_file" do
+    describe "adds the gist file to the gist id" do
+      it "calls the github client" do
+        store = GistStore.new
+        file = GistFile.new(
+          filename: "filename",
+          content: "content",
+          raw_url: "raw_url"
+        )
+
+        store.store_gist_file("1", file)
+
+        store.store["1"].should eq([file])
+      end
+    end
+  end
+
+  describe "the gist does not exist in the store" do
+    it "returns a MissingGistFile" do
+      missing_file = MissingGistFile.new(id: "1", filename: "filename")
+      store = GistStore.new
+
+      file = store.get_gist_files(id: "1", filename: "filename")
+
+      file.should eq([missing_file])
+    end
+  end
+
+  describe "when a filename is given" do
+    it "returns the GistFile for that filename" do
+      store = GistStore.new
+      file1 = GistFile.new("one", "", "")
+      file2 = GistFile.new("two", "", "")
+      store.store["1"] = [file1, file2]
+
+      gists = store.get_gist_files(id: "1", filename: "one")
+
+      gists.should eq([file1])
+      gists.should_not contain([file2])
+    end
+  end
+
+  describe "when a filename is NOT given" do
+    it "returns all GistFiles" do
+      store = GistStore.new
+      file1 = GistFile.new("one", "", "")
+      file2 = GistFile.new("two", "", "")
+      store.store["1"] = [file1, file2]
+
+      gists = store.get_gist_files(id: "1", filename: nil)
+
+      gists.should eq([file1, file2])
+    end
+  end
+end
diff --git a/spec/classes/paragraph_converter_spec.cr b/spec/classes/paragraph_converter_spec.cr
index c3e30cb..e66b824 100644
--- a/spec/classes/paragraph_converter_spec.cr
+++ b/spec/classes/paragraph_converter_spec.cr
@@ -4,6 +4,7 @@ include Nodes
 
 describe ParagraphConverter do
   it "converts a simple structure with no markups" do
+    gist_store = GistStore.new
     paragraphs = Array(PostResponse::Paragraph).from_json <<-JSON
       [
         {
@@ -18,12 +19,13 @@ describe ParagraphConverter do
     JSON
     expected = [Heading3.new(children: [Text.new(content: "Title")] of Child)]
 
-    result = ParagraphConverter.new.convert(paragraphs)
+    result = ParagraphConverter.new.convert(paragraphs, gist_store)
 
     result.should eq expected
   end
 
   it "converts a simple structure with a markup" do
+    gist_store = GistStore.new
     paragraphs = Array(PostResponse::Paragraph).from_json <<-JSON
       [
         {
@@ -54,12 +56,13 @@ describe ParagraphConverter do
       ] of Child),
     ]
 
-    result = ParagraphConverter.new.convert(paragraphs)
+    result = ParagraphConverter.new.convert(paragraphs, gist_store)
 
     result.should eq expected
   end
 
   it "groups <ul> list items into one list" do
+    gist_store = GistStore.new
     paragraphs = Array(PostResponse::Paragraph).from_json <<-JSON
       [
         {
@@ -96,12 +99,13 @@ describe ParagraphConverter do
       Paragraph.new(children: [Text.new(content: "Not a list item")] of Child),
     ]
 
-    result = ParagraphConverter.new.convert(paragraphs)
+    result = ParagraphConverter.new.convert(paragraphs, gist_store)
 
     result.should eq expected
   end
 
   it "groups <ol> list items into one list" do
+    gist_store = GistStore.new
     paragraphs = Array(PostResponse::Paragraph).from_json <<-JSON
       [
         {
@@ -138,12 +142,13 @@ describe ParagraphConverter do
       Paragraph.new(children: [Text.new(content: "Not a list item")] of Child),
     ]
 
-    result = ParagraphConverter.new.convert(paragraphs)
+    result = ParagraphConverter.new.convert(paragraphs, gist_store)
 
     result.should eq expected
   end
 
   it "converts an IMG to a Figure" do
+    gist_store = GistStore.new
     paragraph = PostResponse::Paragraph.from_json <<-JSON
       {
         "text": "Image by someuser",
@@ -182,12 +187,13 @@ describe ParagraphConverter do
       ] of Child),
     ]
 
-    result = ParagraphConverter.new.convert([paragraph])
+    result = ParagraphConverter.new.convert([paragraph], gist_store)
 
     result.should eq expected
   end
 
   it "converts all the tags" do
+    gist_store = GistStore.new
     paragraphs = Array(PostResponse::Paragraph).from_json <<-JSON
       [
         {
@@ -333,7 +339,7 @@ describe ParagraphConverter do
       ] of Child),
     ]
 
-    result = ParagraphConverter.new.convert(paragraphs)
+    result = ParagraphConverter.new.convert(paragraphs, gist_store)
 
     result.should eq expected
   end
diff --git a/spec/classes/rate_limited_gist_store_spec.cr b/spec/classes/rate_limited_gist_store_spec.cr
new file mode 100644
index 0000000..a4a3327
--- /dev/null
+++ b/spec/classes/rate_limited_gist_store_spec.cr
@@ -0,0 +1,23 @@
+require "../spec_helper"
+
+describe RateLimitedGistStore do
+  describe "when a filename is given" do
+    it "returns a RateLimitedGistFile for that filename" do
+      store = RateLimitedGistStore.new
+
+      gists = store.get_gist_files(id: "1", filename: "one")
+
+      gists.should eq([RateLimitedGistFile.new(id: "1", filename: "one")])
+    end
+  end
+
+  describe "when a filename is NOT given" do
+    it "returns a single RateLimitedGistFile" do
+      store = RateLimitedGistStore.new
+
+      gists = store.get_gist_files(id: "1", filename: nil)
+
+      gists.should eq([RateLimitedGistFile.new(id: "1", filename: nil)])
+    end
+  end
+end
diff --git a/spec/components/page_content_spec.cr b/spec/components/page_content_spec.cr
index 7b40351..bdcd9ed 100644
--- a/spec/components/page_content_spec.cr
+++ b/spec/components/page_content_spec.cr
@@ -146,19 +146,33 @@ describe PageContent do
   end
 
   it "renders a GitHub Gist" do
+    store = GistStore.new
+    gist_file = GistFile.new(
+      filename: "example",
+      content: "content",
+      raw_url: "https://gist.githubusercontent.com/user/1/raw/abc/example"
+    )
+    store.store["1"] = [gist_file]
     page = Page.new(
       title: "Title",
       author: user_anchor_factory,
       created_at: Time.local,
       nodes: [
-        GithubGist.new(href: "https://gist.github.com/user/some_id"),
+        GithubGist.new(href: "https://gist.github.com/user/1", gist_store: store),
       ] of Child
     )
 
     html = PageContent.new(page: page).render_to_string
 
     html.should eq stripped_html <<-HTML
-      <script src="https://gist.github.com/user/some_id.js"></script>
+      <p>
+        <code>
+          <a href="https://gist.github.com/user/1#file-example">example</a>
+        </code>
+      </p>
+      <pre class="gist">
+        <code>content</code>
+      </pre>
     HTML
   end
 
diff --git a/spec/models/gist_file_spec.cr b/spec/models/gist_file_spec.cr
new file mode 100644
index 0000000..380adcd
--- /dev/null
+++ b/spec/models/gist_file_spec.cr
@@ -0,0 +1,34 @@
+require "../spec_helper"
+
+describe GistFile do
+  it "is parsed from json" do
+    json = <<-JSON
+      {
+        "filename": "example.txt",
+        "raw_url": "https://gist.githubusercontent.com/user/1D/raw/FFF/example.txt",
+        "content": "content"
+      }
+    JSON
+
+    gist_file = GistFile.from_json(json)
+
+    gist_file.filename.should eq("example.txt")
+    gist_file.content.should eq("content")
+    gist_file.raw_url.should eq("https://gist.githubusercontent.com/user/1D/raw/FFF/example.txt")
+  end
+
+  it "returns an href for the gist's webpage" do
+    json = <<-JSON
+      {
+        "filename": "example.txt",
+        "raw_url": "https://gist.githubusercontent.com/user/1D/raw/FFF/example.txt",
+        "content": "content"
+      }
+    JSON
+    gist_file = GistFile.from_json(json)
+
+    href = gist_file.href
+
+    href.should eq("https://gist.github.com/user/1D#file-example-txt")
+  end
+end
diff --git a/spec/models/gist_params_spec.cr b/spec/models/gist_params_spec.cr
new file mode 100644
index 0000000..df711be
--- /dev/null
+++ b/spec/models/gist_params_spec.cr
@@ -0,0 +1,33 @@
+require "../spec_helper"
+
+describe GistParams do
+  it "extracts params from the gist url" do
+    url = "https://gist.github.com/user/1D?file=example.txt"
+
+    params = GistParams.extract_from_url(url)
+
+    params.id.should eq("1D")
+    params.filename.should eq("example.txt")
+  end
+
+  describe "when no file param exists" do
+    it "does not extract a filename" do
+      url = "https://gist.github.com/user/1D"
+
+      params = GistParams.extract_from_url(url)
+
+      params.id.should eq("1D")
+      params.filename.should be_nil
+    end
+  end
+
+  describe "when the URL is not a gist URL" do
+    it "raises a MissingGistId exception" do
+      url = "https://example.com"
+
+      expect_raises(GistParams::MissingGistId, message: "https://example.com") do
+        GistParams.extract_from_url(url)
+      end
+    end
+  end
+end
diff --git a/src/classes/embedded_converter.cr b/src/classes/embedded_converter.cr
index 6848bfc..925e54f 100644
--- a/src/classes/embedded_converter.cr
+++ b/src/classes/embedded_converter.cr
@@ -1,15 +1,22 @@
 class EmbeddedConverter
   include Nodes
 
-  GIST_HOST = "https://gist.github.com"
+  GIST_HOST_AND_SCHEME = "https://#{GIST_HOST}"
 
   getter paragraph : PostResponse::Paragraph
+  getter gist_store : GistStore | RateLimitedGistStore
 
-  def self.convert(paragraph : PostResponse::Paragraph) : Embedded | Empty
-    new(paragraph).convert
+  def self.convert(
+    paragraph : PostResponse::Paragraph,
+    gist_store : GistStore | RateLimitedGistStore
+  ) : Embedded | Empty
+    new(paragraph, gist_store).convert
   end
 
-  def initialize(@paragraph : PostResponse::Paragraph)
+  def initialize(
+    @paragraph : PostResponse::Paragraph,
+    @gist_store : GistStore | RateLimitedGistStore
+  )
   end
 
   def convert : Embedded | Empty
@@ -33,8 +40,8 @@ class EmbeddedConverter
   end
 
   private def custom_embed(media : PostResponse::MediaResource) : Embedded
-    if media.href.starts_with?(GIST_HOST)
-      GithubGist.new(href: media.href)
+    if media.href.starts_with?(GIST_HOST_AND_SCHEME)
+      GithubGist.new(href: media.href, gist_store: gist_store)
     else
       EmbeddedLink.new(href: media.href)
     end
diff --git a/src/classes/gist_scanner.cr b/src/classes/gist_scanner.cr
new file mode 100644
index 0000000..9fd6852
--- /dev/null
+++ b/src/classes/gist_scanner.cr
@@ -0,0 +1,28 @@
+class GistScanner
+  GIST_HOST_AND_SCHEME = "https://#{GIST_HOST}"
+
+  getter paragraphs : Array(PostResponse::Paragraph)
+
+  def initialize(@paragraphs : Array(PostResponse::Paragraph))
+  end
+
+  def scan
+    maybe_urls = paragraphs.compact_map do |paragraph|
+      Monads::Try(PostResponse::IFrame).new(->{ paragraph.iframe })
+        .to_maybe
+        .fmap(->(iframe : PostResponse::IFrame) { iframe.mediaResource })
+        .fmap(->(media : PostResponse::MediaResource) { media.href })
+        .value_or(nil)
+    end
+    maybe_urls
+      .select { |url| url.starts_with?(GIST_HOST_AND_SCHEME) }
+      .map { |url| url_without_params(url) }
+      .uniq
+  end
+
+  def url_without_params(url)
+    uri = URI.parse(url)
+    uri.query = nil
+    uri.to_s
+  end
+end
diff --git a/src/classes/gist_store.cr b/src/classes/gist_store.cr
new file mode 100644
index 0000000..9fd148a
--- /dev/null
+++ b/src/classes/gist_store.cr
@@ -0,0 +1,45 @@
+alias GistFiles = Array(GistFile)
+alias GistHash = Hash(String, GistFiles)
+
+class GistStore
+  property store : GistHash
+
+  def initialize(@store : GistHash = {} of String => GistFiles)
+  end
+
+  def store_gist_file(id : String, file : GistFile)
+    store[id] ||= [] of GistFile
+    store[id] << file
+  end
+
+  def get_gist_files(id : String, filename : String?) : Array(GistFile) | Array(MissingGistFile)
+    files = store[id]?
+    missing_file = MissingGistFile.new(id: id, filename: filename)
+    if files
+      if filename
+        find_gist_file(files, filename, missing_file)
+      else
+        files
+      end
+    else
+      return [missing_file]
+    end
+  end
+
+  private def find_gist_file(
+    files : Array(GistFile),
+    filename : String,
+    missing_file : MissingGistFile
+  ) : Array(GistFile) | Array(MissingGistFile)
+    gist_file = files.find { |file| file.filename == filename }
+    if gist_file
+      [gist_file]
+    else
+      [missing_file]
+    end
+  end
+
+  private def client_class
+    GithubClient
+  end
+end
diff --git a/src/classes/page_converter.cr b/src/classes/page_converter.cr
index 70f932f..a277f96 100644
--- a/src/classes/page_converter.cr
+++ b/src/classes/page_converter.cr
@@ -3,11 +3,12 @@ class PageConverter
     title, content = title_and_content(data)
     author = data.post.creator
     created_at = Time.unix_ms(data.post.createdAt)
+    gist_store = gist_store(content)
     Page.new(
       title: title,
       author: author,
       created_at: Time.unix_ms(data.post.createdAt),
-      nodes: ParagraphConverter.new.convert(content)
+      nodes: ParagraphConverter.new.convert(content, gist_store)
     )
   end
 
@@ -17,4 +18,20 @@ class PageConverter
     non_content_paragraphs = paragraphs.reject { |para| para.text == title }
     {title, non_content_paragraphs}
   end
+
+  private def gist_store(paragraphs) : GistStore | RateLimitedGistStore
+    store = GistStore.new
+    gist_urls = GistScanner.new(paragraphs).scan
+    gist_responses = gist_urls.map do |url|
+      params = GistParams.extract_from_url(url)
+      response = GithubClient.get_gist_response(params.id)
+      if response.is_a?(GithubClient::RateLimitedResponse)
+        return RateLimitedGistStore.new
+      end
+      JSON.parse(response.data.body)["files"].as_h.values.map do |json_any|
+        store.store_gist_file(params.id, GistFile.from_json(json_any.to_json))
+      end
+    end
+    store
+  end
 end
diff --git a/src/classes/paragraph_converter.cr b/src/classes/paragraph_converter.cr
index d62185f..00a76f4 100644
--- a/src/classes/paragraph_converter.cr
+++ b/src/classes/paragraph_converter.cr
@@ -1,7 +1,10 @@
 class ParagraphConverter
   include Nodes
 
-  def convert(paragraphs : Array(PostResponse::Paragraph)) : Array(Child)
+  def convert(
+    paragraphs : Array(PostResponse::Paragraph),
+    gist_store : GistStore | RateLimitedGistStore
+  ) : Array(Child)
     if paragraphs.first?.nil?
       return [Empty.new] of Child
     else
@@ -24,7 +27,7 @@ class ParagraphConverter
         node = Heading3.new(children: children)
       when PostResponse::ParagraphType::IFRAME
         paragraph = paragraphs.shift
-        node = EmbeddedConverter.convert(paragraph)
+        node = EmbeddedConverter.convert(paragraph, gist_store)
       when PostResponse::ParagraphType::IMG
         paragraph = paragraphs.shift
         node = convert_img(paragraph)
@@ -60,7 +63,7 @@ class ParagraphConverter
         node = Empty.new
       end
 
-      [node, convert(paragraphs)].flatten.reject(&.empty?)
+      [node, convert(paragraphs, gist_store)].flatten.reject(&.empty?)
     end
   end
 
diff --git a/src/classes/rate_limited_gist_store.cr b/src/classes/rate_limited_gist_store.cr
new file mode 100644
index 0000000..d5a2516
--- /dev/null
+++ b/src/classes/rate_limited_gist_store.cr
@@ -0,0 +1,5 @@
+class RateLimitedGistStore
+  def get_gist_files(id : String, filename : String?)
+    [RateLimitedGistFile.new(id: id, filename: filename)]
+  end
+end
diff --git a/src/clients/github_client.cr b/src/clients/github_client.cr
new file mode 100644
index 0000000..b6a5610
--- /dev/null
+++ b/src/clients/github_client.cr
@@ -0,0 +1,37 @@
+class GithubClient
+  class SuccessfulResponse
+    getter data : HTTP::Client::Response
+
+    def initialize(@data : HTTP::Client::Response)
+    end
+  end
+
+  class RateLimitedResponse
+  end
+
+  def self.get_gist_response(id : String) : SuccessfulResponse | RateLimitedResponse
+    new.get_gist_response(id)
+  end
+
+  def get_gist_response(id : String) : SuccessfulResponse | RateLimitedResponse
+    client = HTTP::Client.new("api.github.com", tls: true)
+    if username && password
+      client.basic_auth(username, password)
+    end
+    response = client.get("/gists/#{id}")
+    if response.status == HTTP::Status::FORBIDDEN &&
+       response.headers["X-RateLimit-Remaining"] == "0"
+      RateLimitedResponse.new
+    else
+      SuccessfulResponse.new(response)
+    end
+  end
+
+  private def username
+    ENV["GITHUB_USERNAME"]?
+  end
+
+  private def password
+    ENV["GITHUB_PERSONAL_ACCESS_TOKEN"]?
+  end
+end
diff --git a/src/components/page_content.cr b/src/components/page_content.cr
index 7ffb41f..2849bbc 100644
--- a/src/components/page_content.cr
+++ b/src/components/page_content.cr
@@ -77,8 +77,19 @@ class PageContent < BaseComponent
     end
   end
 
-  def render_child(child : GithubGist)
-    script src: child.src
+  def render_child(gist : GithubGist)
+    gist.files.map { |gist_file| render_child(gist_file) }
+  end
+
+  def render_child(gist_file : GistFile | MissingGistFile | RateLimitedGistFile)
+    para do
+      code do
+        a gist_file.filename, href: gist_file.href
+      end
+    end
+    pre class: "gist" do
+      code gist_file.content
+    end
   end
 
   def render_child(node : Heading1)
diff --git a/src/constants.cr b/src/constants.cr
index a136d90..f750574 100644
--- a/src/constants.cr
+++ b/src/constants.cr
@@ -1,2 +1,4 @@
 # https://stackoverflow.com/questions/2669690/
 JSON_HIJACK_STRING = "])}while(1);</x>"
+
+GIST_HOST = "gist.github.com"
diff --git a/src/models/gist_file.cr b/src/models/gist_file.cr
new file mode 100644
index 0000000..18bced1
--- /dev/null
+++ b/src/models/gist_file.cr
@@ -0,0 +1,89 @@
+class GistFile
+  include JSON::Serializable
+
+  getter filename : String
+  getter content : String
+  getter raw_url : String
+
+  def initialize(@filename : String, @content : String, @raw_url : String)
+  end
+
+  def href
+    uri = URI.parse(raw_url)
+    uri.host = GIST_HOST
+    path_and_file_anchor = path_and_file_anchor(uri)
+    uri.path = path_and_file_anchor.path
+    uri.fragment = path_and_file_anchor.file_anchor
+    uri.to_s
+  end
+
+  private def path_and_file_anchor(uri : URI)
+    path_parts = uri.path.split("/")
+    PathAndFileAnchor.new(
+      path: [path_parts[1], path_parts[2]].join("/"),
+      filename: path_parts[-1]
+    )
+  end
+
+  class PathAndFileAnchor
+    getter file_anchor : String
+    getter path : String
+
+    def initialize(@path : String, filename : String)
+      @file_anchor = "file-" + filename.tr(" ", "-").tr(".", "-")
+    end
+  end
+end
+
+class MissingGistFile
+  GIST_HOST_AND_SCHEME = "https://#{GIST_HOST}"
+
+  def initialize(@id : String, @filename : String?)
+  end
+
+  def content
+    <<-TEXT
+      Gist file missing.
+      Click on filename to go to gist.
+    TEXT
+  end
+
+  def href
+    GIST_HOST_AND_SCHEME + "/#{@id}"
+  end
+
+  def filename
+    @filename || "Unknown filename"
+  end
+
+  def ==(other : MissingGistFile)
+    other.filename == filename && other.href == href
+  end
+end
+
+class RateLimitedGistFile
+  GIST_HOST_AND_SCHEME = "https://#{GIST_HOST}"
+
+  def initialize(@id : String, @filename : String?)
+  end
+
+  def content
+    <<-TEXT
+      Can't fetch gist.
+      GitHub rate limit reached.
+      Click on filename to go to gist.
+    TEXT
+  end
+
+  def href
+    GIST_HOST_AND_SCHEME + "/#{@id}"
+  end
+
+  def filename
+    @filename || "Unknown filename"
+  end
+
+  def ==(other : RateLimitedGistFile)
+    other.filename == filename && other.href == href
+  end
+end
diff --git a/src/models/gist_params.cr b/src/models/gist_params.cr
new file mode 100644
index 0000000..c8d5dfe
--- /dev/null
+++ b/src/models/gist_params.cr
@@ -0,0 +1,30 @@
+class GistParams
+  class MissingGistId < Exception
+  end
+
+  GIST_ID_REGEX = /[a-f\d]+$/i
+
+  getter id : String
+  getter filename : String?
+
+  def self.extract_from_url(href : String)
+    uri = URI.parse(href)
+    maybe_id = Monads::Try(Regex::MatchData)
+      .new(->{ uri.path.match(GIST_ID_REGEX) })
+      .to_maybe
+      .fmap(->(matches : Regex::MatchData) { matches[0] })
+    case maybe_id
+    in Monads::Just
+      id = maybe_id.value!
+    in Monads::Nothing, Monads::Maybe
+      raise MissingGistId.new(href)
+    end
+
+    filename = uri.query_params["file"]?
+
+    new(id: id, filename: filename)
+  end
+
+  def initialize(@id : String, @filename : String?)
+  end
+end
diff --git a/src/models/nodes.cr b/src/models/nodes.cr
index 49fee91..0ddce68 100644
--- a/src/models/nodes.cr
+++ b/src/models/nodes.cr
@@ -217,15 +217,21 @@ module Nodes
   end
 
   class GithubGist
-    def initialize(@href : String)
+    getter gist_store : GistStore | RateLimitedGistStore
+
+    def initialize(@href : String, @gist_store : GistStore | RateLimitedGistStore)
     end
 
-    def src
-      "#{@href}.js"
+    def files : Array(GistFile) | Array(MissingGistFile) | Array(RateLimitedGistFile)
+      gist_store.get_gist_files(params.id, params.filename)
+    end
+
+    private def params
+      GistParams.extract_from_url(@href)
     end
 
     def ==(other : GithubGist)
-      other.src == src
+      other.gist_store == gist_store
     end
 
     def empty?
diff --git a/src/models/post_response.cr b/src/models/post_response.cr
index ff960d0..aedf898 100644
--- a/src/models/post_response.cr
+++ b/src/models/post_response.cr
@@ -38,6 +38,16 @@ class PostResponse
     property iframe : IFrame?
     property layout : String?
     property metadata : Metadata?
+
+    def initialize(
+      @text : String?,
+      @type : ParagraphType,
+      @markups : Array(Markup),
+      @iframe : IFrame?,
+      @layout : String?,
+      @metadata : Metadata?
+    )
+    end
   end
 
   enum ParagraphType
@@ -80,6 +90,9 @@ class PostResponse
 
   class IFrame < Base
     property mediaResource : MediaResource
+
+    def initialize(@mediaResource : MediaResource)
+    end
   end
 
   class MediaResource < Base
@@ -87,6 +100,14 @@ class PostResponse
     property iframeSrc : String
     property iframeWidth : Int32
     property iframeHeight : Int32
+
+    def initialize(
+      @href : String,
+      @iframeSrc : String,
+      @iframeWidth : Int32,
+      @iframeHeight : Int32
+    )
+    end
   end
 
   class Metadata < Base
diff --git a/src/version.cr b/src/version.cr
index efa2c88..c22052a 100644
--- a/src/version.cr
+++ b/src/version.cr
@@ -1,3 +1,3 @@
 module Scribe
-  VERSION = "2022-01-08"
+  VERSION = "2022-01-23"
 end