Proxy GitHub gists with rate limiting
Previously, GitHub gists were embedded. The gist URL would be detected in a paragraph and the page would render a script like:

```html
<script src="https://gist.github.com/user/gist_id.js"></script>
```

The script would then embed the gist on the page. However, gists can contain multiple files. It's technically possible to embed a single file in the same way by appending a `file` query param:

```html
<script src="https://gist.github.com/user/gist_id.js?file=foo.txt"></script>
```

I wanted to try and tackle proxying gists instead.

Overview
--------

At a high level the `PageConverter` kicks off the work of fetching and storing the gist content, then sends that content down to the `ParagraphConverter`. When a paragraph comes up that contains a gist embed, it retrieves the previously fetched content. This allows all the necessary content to be fetched up front so the minimum number of requests need to be made.

Fetching Gists
--------------

There is now a `GithubClient` class that gets gist content from GitHub's REST API. The gist API response looks something like this (non-relevant keys removed):

```json
{
  "files": {
    "file-one.txt": {
      "filename": "file-one.txt",
      "raw_url": "https://gist.githubusercontent.com/<username>/<id>/raw/<file_id>/file-one.txt",
      "content": "..."
    },
    "file-two.txt": {
      "filename": "file-two.txt",
      "raw_url": "https://gist.githubusercontent.com/<username>/<id>/raw/<file_id>/file-two.txt",
      "content": "..."
    }
  }
}
```

That response gets turned into a bunch of `GistFile` objects that are then stored in a request-level `GistStore`. Crystal's JSON parsing does not make it easy to parse JSON with arbitrary keys into objects. This is because each key corresponds to an object property, like `property name : String`. If Crystal doesn't know the keys ahead of time, there's no way to know what methods to create. That's a problem here because the key for each gist file is the unique filename.
Fortunately, the keys for each _file_ follow the same pattern and are easy to parse into a `GistFile` object. To turn gist file JSON into Crystal objects, the `PageConverter` turns the whole response into a `JSON::Any`, which is like a Hash. Then it extracts just the file data objects and parses those into `GistFile` objects.

Those `GistFile` objects are then cached in a `GistStore` that is shared for the page, which means one gist cache per request/article. `GistFile` objects can be fetched out of the store by gist ID and filename; if no filename is specified, the store returns all files in the gist.

Each `GistFile` is rendered as the filename, linked to that file in the gist on GitHub, followed by a code block with the file's contents.

In summary, the `PageConverter`:

* Scans the paragraphs for GitHub gists using `GistScanner`
* Requests their data from GitHub using the `GithubClient`
* Parses the response into `GistFile`s and populates the `GistStore`
* Passes that `GistStore` to the `ParagraphConverter` to use when constructing the page nodes

Caching
-------

GitHub limits API requests to 5000/hour with a valid API token and 60/hour without. 60 is pretty tight for the usage that scribe.rip gets, but 5000 is reasonable most of the time. Not every article has an embedded gist, but some articles have multiple gists.

A viral article (of which Scribe has seen two at the time of this commit) might receive a little over 127k hits/day, which is an average of about 5300/hour. If that article had a gist, Scribe would reach the API limit during parts of the day with high traffic. If it had multiple gists, it would hit the limit even sooner. However, average traffic is around 30k visits/day (about 1250/hour), which would be well under the limit, assuming average load.

To help not hit that limit, a `GistStore` holds all the `GistFile` objects per gist. The logic in `GistScanner` is smart enough to only return unique gist URLs, so each gist is only requested once even if multiple files from one gist exist in an article.
This limits the number of times Scribe hits the GitHub API.

If Scribe is rate-limited, instead of populating a `GistStore` the `PageConverter` will create a `RateLimitedGistStore`. This is an object that acts like the `GistStore` but returns `RateLimitedGistFile` objects instead of `GistFile` objects. This allows Scribe to gracefully degrade in the event of reaching the rate limit.

If rate-limiting becomes a regular problem, Scribe could also be reworked to fall back to the embedded gists again.

API Credentials
---------------

API credentials are in the form of a GitHub username and a personal access token attached to that username. To get a token, visit https://github.com/settings/tokens and create a new token. The only permission it needs is `gist`.

This token is set via the `GITHUB_PERSONAL_ACCESS_TOKEN` environment variable. The username also needs to be set via `GITHUB_USERNAME`. When developing locally, these can both be set in the .env file. Authentication is probably not necessary locally, but it's there if you want to test. If either value is missing, unauthenticated requests are made.

Rendering
---------

The node tree itself holds a `GithubGist` object. It has a reference to the `GistStore` and the original gist URL. When it renders, the page requests the gist's `files`. The gist ID and optional file are detected, and then used to request the file(s) from the `GistStore`. Gists render as a list of each file's contents and a link to the file on GitHub.

If the requests were rate limited, the store is a `RateLimitedGistStore` and the files are `RateLimitedGistFile`s. These rate-limited objects are rendered with a link to the gist on GitHub and text saying that Scribe has been rate-limited.

If somehow the file requested doesn't exist in the store, it displays similarly to the rate-limited file but with "file missing" text instead of "rate limited" text.
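The lookup behaviour described under Rendering (one file when a filename is given, every file when it is not, and a placeholder when nothing matches) can be sketched in isolation. This is a simplified illustration, not the code from this commit: `SketchFile`, `SketchMissingFile`, `SketchStore`, `add` and `files_for` are hypothetical names chosen for the example.

```crystal
# Hypothetical, simplified stand-ins for the commit's GistFile and
# MissingGistFile; only the fields needed for this sketch are kept.
record SketchFile, filename : String, content : String

record SketchMissingFile, filename : String do
  def content : String
    "Gist file missing."
  end
end

class SketchStore
  def initialize
    @files = {} of String => Array(SketchFile)
  end

  # Append a file to the list kept for the given gist id.
  def add(id : String, file : SketchFile)
    list = (@files[id] ||= [] of SketchFile)
    list << file
  end

  # With a filename: that one file, or a placeholder if it is absent.
  # Without a filename: every file stored for the gist.
  def files_for(id : String, filename : String?)
    files = @files[id]?
    placeholder = [SketchMissingFile.new(filename || "Unknown filename")]
    return placeholder unless files
    return files unless filename

    match = files.find { |file| file.filename == filename }
    match ? [match] : placeholder
  end
end

store = SketchStore.new
store.add("1", SketchFile.new("one.txt", "first"))
store.add("1", SketchFile.new("two.txt", "second"))

store.files_for("1", nil).each { |file| puts "#{file.filename}: #{file.content}" }
store.files_for("1", "two.txt").each { |file| puts "#{file.filename}: #{file.content}" }
store.files_for("1", "nope.txt").each { |file| puts "#{file.filename}: #{file.content}" }
```

Because both record types respond to `filename` and `content`, the caller can render whatever comes back without checking which kind it received.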
GitHub API docs: https://docs.github.com/en/rest/reference/gists
Rate Limiting docs: https://docs.github.com/en/rest/overview/resources-in-the-rest-api#rate-limiting
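As a rough illustration of the parsing approach described under Fetching Gists, the sketch below reads a trimmed response body into a `JSON::Any`, discards the unpredictable filename keys, and builds typed objects from the values. `SketchGistFile` is a hypothetical stand-in for the commit's `GistFile`, and the response body is made-up sample data rather than real API output.

```crystal
require "json"

# Hypothetical stand-in for the commit's GistFile model.
struct SketchGistFile
  include JSON::Serializable

  getter filename : String
  getter raw_url : String
  getter content : String
end

# Made-up sample body shaped like the gist API response shown above.
body = <<-JSON
  {
    "files": {
      "file-one.txt": {
        "filename": "file-one.txt",
        "raw_url": "https://gist.githubusercontent.com/user/1/raw/abc/file-one.txt",
        "content": "one"
      },
      "file-two.txt": {
        "filename": "file-two.txt",
        "raw_url": "https://gist.githubusercontent.com/user/1/raw/abc/file-two.txt",
        "content": "two"
      }
    }
  }
  JSON

# Parse into JSON::Any, keep only the values under "files" (the keys are
# arbitrary filenames), then re-serialize each value so that
# JSON::Serializable can build a typed object from it.
files = JSON.parse(body)["files"].as_h.values.map do |json_any|
  SketchGistFile.from_json(json_any.to_json)
end

files.each { |file| puts "#{file.filename}: #{file.content}" }
```

The round trip through `to_json` costs an extra serialization per file, but it keeps the typed model free of any knowledge about the outer, arbitrarily keyed object.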
parent 8737ca7897
commit 7518a035b1
22 changed files with 606 additions and 29 deletions
@@ -5,6 +5,7 @@ include Nodes
 describe EmbeddedConverter do
   context "when the mediaResource has an iframeSrc value" do
     it "returns an EmbeddedContent node" do
+      store = GistStore.new
       paragraph = PostResponse::Paragraph.from_json <<-JSON
         {
           "text": "",
@@ -25,7 +26,7 @@ describe EmbeddedConverter do
         }
       JSON

-      result = EmbeddedConverter.convert(paragraph)
+      result = EmbeddedConverter.convert(paragraph, store)

       result.should eq(
         EmbeddedContent.new(
@@ -40,6 +41,7 @@ describe EmbeddedConverter do
   context "when the mediaResource has a blank iframeSrc value" do
     context "and the href is unknown" do
       it "returns an EmbeddedLink node" do
+        store = GistStore.new
         paragraph = PostResponse::Paragraph.from_json <<-JSON
           {
             "text": "",
@@ -60,7 +62,7 @@ describe EmbeddedConverter do
           }
         JSON

-        result = EmbeddedConverter.convert(paragraph)
+        result = EmbeddedConverter.convert(paragraph, store)

         result.should eq(EmbeddedLink.new(href: "https://example.com"))
       end
@@ -68,6 +70,7 @@ describe EmbeddedConverter do

     context "and the href is gist.github.com" do
       it "returns an GithubGist node" do
+        store = GistStore.new
         paragraph = PostResponse::Paragraph.from_json <<-JSON
           {
             "text": "",
@@ -88,10 +91,13 @@ describe EmbeddedConverter do
           }
         JSON

-        result = EmbeddedConverter.convert(paragraph)
+        result = EmbeddedConverter.convert(paragraph, store)

         result.should eq(
-          GithubGist.new(href: "https://gist.github.com/user/someid")
+          GithubGist.new(
+            href: "https://gist.github.com/user/someid",
+            gist_store: store
+          )
         )
       end
     end

spec/classes/gist_scanner_spec.cr (new file, 102 lines)
@@ -0,0 +1,102 @@
+require "../spec_helper"
+
+describe GistScanner do
+  it "returns gist ids from paragraphs" do
+    iframe = PostResponse::IFrame.new(
+      PostResponse::MediaResource.new(
+        href: "https://gist.github.com/user/123ABC",
+        iframeSrc: "",
+        iframeWidth: 0,
+        iframeHeight: 0
+      )
+    )
+    paragraphs = [
+      PostResponse::Paragraph.new(
+        text: "Check out this gist:",
+        type: PostResponse::ParagraphType::P,
+        markups: [] of PostResponse::Markup,
+        iframe: nil,
+        layout: nil,
+        metadata: nil
+      ),
+      PostResponse::Paragraph.new(
+        text: "",
+        type: PostResponse::ParagraphType::IFRAME,
+        markups: [] of PostResponse::Markup,
+        iframe: iframe,
+        layout: nil,
+        metadata: nil
+      ),
+    ]
+
+    result = GistScanner.new(paragraphs).scan
+
+    result.should eq(["https://gist.github.com/user/123ABC"])
+  end
+
+  it "returns ids without the file parameters" do
+    iframe = PostResponse::IFrame.new(
+      PostResponse::MediaResource.new(
+        href: "https://gist.github.com/user/123ABC?file=example.txt",
+        iframeSrc: "",
+        iframeWidth: 0,
+        iframeHeight: 0
+      )
+    )
+    paragraphs = [
+      PostResponse::Paragraph.new(
+        text: "",
+        type: PostResponse::ParagraphType::IFRAME,
+        markups: [] of PostResponse::Markup,
+        iframe: iframe,
+        layout: nil,
+        metadata: nil
+      ),
+    ]
+
+    result = GistScanner.new(paragraphs).scan
+
+    result.should eq(["https://gist.github.com/user/123ABC"])
+  end
+
+  it "returns a unique list of ids" do
+    iframe1 = PostResponse::IFrame.new(
+      PostResponse::MediaResource.new(
+        href: "https://gist.github.com/user/123ABC?file=example.txt",
+        iframeSrc: "",
+        iframeWidth: 0,
+        iframeHeight: 0
+      )
+    )
+    iframe2 = PostResponse::IFrame.new(
+      PostResponse::MediaResource.new(
+        href: "https://gist.github.com/user/123ABC?file=other.txt",
+        iframeSrc: "",
+        iframeWidth: 0,
+        iframeHeight: 0
+      )
+    )
+    paragraphs = [
+      PostResponse::Paragraph.new(
+        text: "",
+        type: PostResponse::ParagraphType::IFRAME,
+        markups: [] of PostResponse::Markup,
+        iframe: iframe1,
+        layout: nil,
+        metadata: nil
+      ),
+      PostResponse::Paragraph.new(
+        text: "",
+        type: PostResponse::ParagraphType::IFRAME,
+        markups: [] of PostResponse::Markup,
+        iframe: iframe2,
+        layout: nil,
+        metadata: nil
+      ),
+    ]
+
+    result = GistScanner.new(paragraphs).scan
+
+    result.should eq(["https://gist.github.com/user/123ABC"])
+  end
+end

spec/classes/gist_store_spec.cr (new file, 58 lines)
@@ -0,0 +1,58 @@
+require "../spec_helper"
+
+describe GistStore do
+  describe "#store_gist_file" do
+    describe "adds the gist file to the gist id" do
+      it "calls the github client" do
+        store = GistStore.new
+        file = GistFile.new(
+          filename: "filename",
+          content: "content",
+          raw_url: "raw_url"
+        )
+
+        store.store_gist_file("1", file)
+
+        store.store["1"].should eq([file])
+      end
+    end
+  end
+
+  describe "the gist does not exist in the store" do
+    it "returns a MissingGistFile" do
+      missing_file = MissingGistFile.new(id: "1", filename: "filename")
+      store = GistStore.new
+
+      file = store.get_gist_files(id: "1", filename: "filename")
+
+      file.should eq([missing_file])
+    end
+  end
+
+  describe "when a filename is given" do
+    it "returns the GistFile for that filename" do
+      store = GistStore.new
+      file1 = GistFile.new("one", "", "")
+      file2 = GistFile.new("two", "", "")
+      store.store["1"] = [file1, file2]
+
+      gists = store.get_gist_files(id: "1", filename: "one")
+
+      gists.should eq([file1])
+      gists.should_not contain([file2])
+    end
+  end
+
+  describe "when a filename is NOT given" do
+    it "returns all GistFiles" do
+      store = GistStore.new
+      file1 = GistFile.new("one", "", "")
+      file2 = GistFile.new("two", "", "")
+      store.store["1"] = [file1, file2]
+
+      gists = store.get_gist_files(id: "1", filename: nil)
+
+      gists.should eq([file1, file2])
+    end
+  end
+end

@@ -4,6 +4,7 @@ include Nodes

 describe ParagraphConverter do
   it "converts a simple structure with no markups" do
+    gist_store = GistStore.new
     paragraphs = Array(PostResponse::Paragraph).from_json <<-JSON
       [
         {
@@ -18,12 +19,13 @@ describe ParagraphConverter do
     JSON
     expected = [Heading3.new(children: [Text.new(content: "Title")] of Child)]

-    result = ParagraphConverter.new.convert(paragraphs)
+    result = ParagraphConverter.new.convert(paragraphs, gist_store)

     result.should eq expected
   end

   it "converts a simple structure with a markup" do
+    gist_store = GistStore.new
     paragraphs = Array(PostResponse::Paragraph).from_json <<-JSON
       [
         {
@@ -54,12 +56,13 @@ describe ParagraphConverter do
       ] of Child),
     ]

-    result = ParagraphConverter.new.convert(paragraphs)
+    result = ParagraphConverter.new.convert(paragraphs, gist_store)

     result.should eq expected
   end

   it "groups <ul> list items into one list" do
+    gist_store = GistStore.new
     paragraphs = Array(PostResponse::Paragraph).from_json <<-JSON
       [
         {
@@ -96,12 +99,13 @@ describe ParagraphConverter do
       Paragraph.new(children: [Text.new(content: "Not a list item")] of Child),
     ]

-    result = ParagraphConverter.new.convert(paragraphs)
+    result = ParagraphConverter.new.convert(paragraphs, gist_store)

     result.should eq expected
   end

   it "groups <ol> list items into one list" do
+    gist_store = GistStore.new
     paragraphs = Array(PostResponse::Paragraph).from_json <<-JSON
       [
         {
@@ -138,12 +142,13 @@ describe ParagraphConverter do
       Paragraph.new(children: [Text.new(content: "Not a list item")] of Child),
     ]

-    result = ParagraphConverter.new.convert(paragraphs)
+    result = ParagraphConverter.new.convert(paragraphs, gist_store)

     result.should eq expected
   end

   it "converts an IMG to a Figure" do
+    gist_store = GistStore.new
     paragraph = PostResponse::Paragraph.from_json <<-JSON
       {
         "text": "Image by someuser",
@@ -182,12 +187,13 @@ describe ParagraphConverter do
       ] of Child),
     ]

-    result = ParagraphConverter.new.convert([paragraph])
+    result = ParagraphConverter.new.convert([paragraph], gist_store)

     result.should eq expected
   end

   it "converts all the tags" do
+    gist_store = GistStore.new
     paragraphs = Array(PostResponse::Paragraph).from_json <<-JSON
       [
         {
@@ -333,7 +339,7 @@ describe ParagraphConverter do
       ] of Child),
     ]

-    result = ParagraphConverter.new.convert(paragraphs)
+    result = ParagraphConverter.new.convert(paragraphs, gist_store)

     result.should eq expected
   end

spec/classes/rate_limited_gist_store_spec.cr (new file, 23 lines)
@@ -0,0 +1,23 @@
+require "../spec_helper"
+
+describe RateLimitedGistStore do
+  describe "when a filename is given" do
+    it "returns a RateLimitedGistFile for that filename" do
+      store = RateLimitedGistStore.new
+
+      gists = store.get_gist_files(id: "1", filename: "one")
+
+      gists.should eq([RateLimitedGistFile.new(id: "1", filename: "one")])
+    end
+  end
+
+  describe "when a filename is NOT given" do
+    it "returns a single RateLimitedGistFile" do
+      store = RateLimitedGistStore.new
+
+      gists = store.get_gist_files(id: "1", filename: nil)
+
+      gists.should eq([RateLimitedGistFile.new(id: "1", filename: nil)])
+    end
+  end
+end

@@ -146,19 +146,33 @@ describe PageContent do
   end

   it "renders a GitHub Gist" do
+    store = GistStore.new
+    gist_file = GistFile.new(
+      filename: "example",
+      content: "content",
+      raw_url: "https://gist.githubusercontent.com/user/1/raw/abc/example"
+    )
+    store.store["1"] = [gist_file]
     page = Page.new(
       title: "Title",
       author: user_anchor_factory,
       created_at: Time.local,
       nodes: [
-        GithubGist.new(href: "https://gist.github.com/user/some_id"),
+        GithubGist.new(href: "https://gist.github.com/user/1", gist_store: store),
       ] of Child
     )

     html = PageContent.new(page: page).render_to_string

     html.should eq stripped_html <<-HTML
-      <script src="https://gist.github.com/user/some_id.js"></script>
+      <p>
+        <code>
+          <a href="https://gist.github.com/user/1#file-example">example</a>
+        </code>
+      </p>
+      <pre class="gist">
+        <code>content</code>
+      </pre>
     HTML
   end

spec/models/gist_file_spec.cr (new file, 34 lines)
@@ -0,0 +1,34 @@
+require "../spec_helper"
+
+describe GistFile do
+  it "is parsed from json" do
+    json = <<-JSON
+      {
+        "filename": "example.txt",
+        "raw_url": "https://gist.githubusercontent.com/user/1D/raw/FFF/example.txt",
+        "content": "content"
+      }
+    JSON
+
+    gist_file = GistFile.from_json(json)
+
+    gist_file.filename.should eq("example.txt")
+    gist_file.content.should eq("content")
+    gist_file.raw_url.should eq("https://gist.githubusercontent.com/user/1D/raw/FFF/example.txt")
+  end
+
+  it "returns an href for the gist's webpage" do
+    json = <<-JSON
+      {
+        "filename": "example.txt",
+        "raw_url": "https://gist.githubusercontent.com/user/1D/raw/FFF/example.txt",
+        "content": "content"
+      }
+    JSON
+    gist_file = GistFile.from_json(json)
+
+    href = gist_file.href
+
+    href.should eq("https://gist.github.com/user/1D#file-example-txt")
+  end
+end

spec/models/gist_params_spec.cr (new file, 33 lines)
@@ -0,0 +1,33 @@
+require "../spec_helper"
+
+describe GistParams do
+  it "extracts params from the gist url" do
+    url = "https://gist.github.com/user/1D?file=example.txt"
+
+    params = GistParams.extract_from_url(url)
+
+    params.id.should eq("1D")
+    params.filename.should eq("example.txt")
+  end
+
+  describe "when no file param exists" do
+    it "does not extract a filename" do
+      url = "https://gist.github.com/user/1D"
+
+      params = GistParams.extract_from_url(url)
+
+      params.id.should eq("1D")
+      params.filename.should be_nil
+    end
+  end
+
+  describe "when the URL is not a gist URL" do
+    it "raises a MissingGistId exeption" do
+      url = "https://example.com"
+
+      expect_raises(GistParams::MissingGistId, message: "https://example.com") do
+        GistParams.extract_from_url(url)
+      end
+    end
+  end
+end

@@ -1,15 +1,22 @@
 class EmbeddedConverter
   include Nodes

-  GIST_HOST = "https://gist.github.com"
+  GIST_HOST_AND_SCHEME = "https://#{GIST_HOST}"

   getter paragraph : PostResponse::Paragraph
+  getter gist_store : GistStore | RateLimitedGistStore

-  def self.convert(paragraph : PostResponse::Paragraph) : Embedded | Empty
-    new(paragraph).convert
+  def self.convert(
+    paragraph : PostResponse::Paragraph,
+    gist_store : GistStore | RateLimitedGistStore
+  ) : Embedded | Empty
+    new(paragraph, gist_store).convert
   end

-  def initialize(@paragraph : PostResponse::Paragraph)
+  def initialize(
+    @paragraph : PostResponse::Paragraph,
+    @gist_store : GistStore | RateLimitedGistStore
+  )
   end

   def convert : Embedded | Empty
@@ -33,8 +40,8 @@ class EmbeddedConverter
   end

   private def custom_embed(media : PostResponse::MediaResource) : Embedded
-    if media.href.starts_with?(GIST_HOST)
-      GithubGist.new(href: media.href)
+    if media.href.starts_with?(GIST_HOST_AND_SCHEME)
+      GithubGist.new(href: media.href, gist_store: gist_store)
     else
       EmbeddedLink.new(href: media.href)
     end

src/classes/gist_scanner.cr (new file, 28 lines)
@@ -0,0 +1,28 @@
+class GistScanner
+  GIST_HOST_AND_SCHEME = "https://#{GIST_HOST}"
+
+  getter paragraphs : Array(PostResponse::Paragraph)
+
+  def initialize(@paragraphs : Array(PostResponse::Paragraph))
+  end
+
+  def scan
+    maybe_urls = paragraphs.compact_map do |paragraph|
+      Monads::Try(PostResponse::IFrame).new(->{ paragraph.iframe })
+        .to_maybe
+        .fmap(->(iframe : PostResponse::IFrame) { iframe.mediaResource })
+        .fmap(->(media : PostResponse::MediaResource) { media.href })
+        .value_or(nil)
+    end
+    maybe_urls
+      .select { |url| url.starts_with?(GIST_HOST_AND_SCHEME) }
+      .map { |url| url_without_params(url) }
+      .uniq
+  end
+
+  def url_without_params(url)
+    uri = URI.parse(url)
+    uri.query = nil
+    uri.to_s
+  end
+end

src/classes/gist_store.cr (new file, 45 lines)
@@ -0,0 +1,45 @@
+alias GistFiles = Array(GistFile)
+alias GistHash = Hash(String, GistFiles)
+
+class GistStore
+  property store : GistHash
+
+  def initialize(@store : GistHash = {} of String => GistFiles)
+  end
+
+  def store_gist_file(id : String, file : GistFile)
+    store[id] ||= [] of GistFile
+    store[id] << file
+  end
+
+  def get_gist_files(id : String, filename : String?) : Array(GistFile) | Array(MissingGistFile)
+    files = store[id]?
+    missing_file = MissingGistFile.new(id: id, filename: filename)
+    if files
+      if filename
+        find_gist_file(files, filename, missing_file)
+      else
+        files
+      end
+    else
+      return [missing_file]
+    end
+  end
+
+  private def find_gist_file(
+    files : Array(GistFile),
+    filename : String,
+    missing_file : MissingGistFile
+  ) : Array(GistFile) | Array(MissingGistFile)
+    gist_file = files.find { |file| file.filename == filename }
+    if gist_file
+      [gist_file]
+    else
+      [missing_file]
+    end
+  end
+
+  private def client_class
+    GithubClient
+  end
+end

@@ -3,11 +3,12 @@ class PageConverter
     title, content = title_and_content(data)
     author = data.post.creator
     created_at = Time.unix_ms(data.post.createdAt)
+    gist_store = gist_store(content)
     Page.new(
       title: title,
       author: author,
       created_at: Time.unix_ms(data.post.createdAt),
-      nodes: ParagraphConverter.new.convert(content)
+      nodes: ParagraphConverter.new.convert(content, gist_store)
     )
   end

@@ -17,4 +18,20 @@ class PageConverter
     non_content_paragraphs = paragraphs.reject { |para| para.text == title }
     {title, non_content_paragraphs}
   end
+
+  private def gist_store(paragraphs) : GistStore | RateLimitedGistStore
+    store = GistStore.new
+    gist_urls = GistScanner.new(paragraphs).scan
+    gist_responses = gist_urls.map do |url|
+      params = GistParams.extract_from_url(url)
+      response = GithubClient.get_gist_response(params.id)
+      if response.is_a?(GithubClient::RateLimitedResponse)
+        return RateLimitedGistStore.new
+      end
+      JSON.parse(response.data.body)["files"].as_h.values.map do |json_any|
+        store.store_gist_file(params.id, GistFile.from_json(json_any.to_json))
+      end
+    end
+    store
+  end
 end

@@ -1,7 +1,10 @@
 class ParagraphConverter
   include Nodes

-  def convert(paragraphs : Array(PostResponse::Paragraph)) : Array(Child)
+  def convert(
+    paragraphs : Array(PostResponse::Paragraph),
+    gist_store : GistStore | RateLimitedGistStore
+  ) : Array(Child)
     if paragraphs.first?.nil?
       return [Empty.new] of Child
     else
@@ -24,7 +27,7 @@ class ParagraphConverter
         node = Heading3.new(children: children)
       when PostResponse::ParagraphType::IFRAME
         paragraph = paragraphs.shift
-        node = EmbeddedConverter.convert(paragraph)
+        node = EmbeddedConverter.convert(paragraph, gist_store)
       when PostResponse::ParagraphType::IMG
         paragraph = paragraphs.shift
         node = convert_img(paragraph)
@@ -60,7 +63,7 @@ class ParagraphConverter
         node = Empty.new
       end

-      [node, convert(paragraphs)].flatten.reject(&.empty?)
+      [node, convert(paragraphs, gist_store)].flatten.reject(&.empty?)
     end
   end

src/classes/rate_limited_gist_store.cr (new file, 5 lines)
@@ -0,0 +1,5 @@
+class RateLimitedGistStore
+  def get_gist_files(id : String, filename : String?)
+    [RateLimitedGistFile.new(id: id, filename: filename)]
+  end
+end

src/clients/github_client.cr (new file, 37 lines)
@@ -0,0 +1,37 @@
+class GithubClient
+  class SuccessfulResponse
+    getter data : HTTP::Client::Response
+
+    def initialize(@data : HTTP::Client::Response)
+    end
+  end
+
+  class RateLimitedResponse
+  end
+
+  def self.get_gist_response(id : String) : SuccessfulResponse | RateLimitedResponse
+    new.get_gist_response(id)
+  end
+
+  def get_gist_response(id : String) : SuccessfulResponse | RateLimitedResponse
+    client = HTTP::Client.new("api.github.com", tls: true)
+    if username && password
+      client.basic_auth(username, password)
+    end
+    response = client.get("/gists/#{id}")
+    if response.status == HTTP::Status::FORBIDDEN &&
+       response.headers["X-RateLimit-Remaining"] == "0"
+      RateLimitedResponse.new
+    else
+      SuccessfulResponse.new(response)
+    end
+  end
+
+  private def username
+    ENV["GITHUB_USERNAME"]?
+  end
+
+  private def password
+    ENV["GITHUB_PERSONAL_ACCESS_TOKEN"]?
+  end
+end

@@ -77,8 +77,19 @@ class PageContent < BaseComponent
     end
   end

-  def render_child(child : GithubGist)
-    script src: child.src
+  def render_child(gist : GithubGist)
+    gist.files.map { |gist_file| render_child(gist_file) }
+  end
+
+  def render_child(gist_file : GistFile | MissingGistFile | RateLimitedGistFile)
+    para do
+      code do
+        a gist_file.filename, href: gist_file.href
+      end
+    end
+    pre class: "gist" do
+      code gist_file.content
+    end
   end

   def render_child(node : Heading1)

@@ -1,2 +1,4 @@
 # https://stackoverflow.com/questions/2669690/
 JSON_HIJACK_STRING = "])}while(1);</x>"
+
+GIST_HOST = "gist.github.com"

src/models/gist_file.cr (new file, 89 lines)
@@ -0,0 +1,89 @@
+class GistFile
+  include JSON::Serializable
+
+  getter filename : String
+  getter content : String
+  getter raw_url : String
+
+  def initialize(@filename : String, @content : String, @raw_url : String)
+  end
+
+  def href
+    uri = URI.parse(raw_url)
+    uri.host = GIST_HOST
+    path_and_file_anchor = path_and_file_anchor(uri)
+    uri.path = path_and_file_anchor.path
+    uri.fragment = path_and_file_anchor.file_anchor
+    uri.to_s
+  end
+
+  private def path_and_file_anchor(uri : URI)
+    path_parts = uri.path.split("/")
+    PathAndFileAnchor.new(
+      path: [path_parts[1], path_parts[2]].join("/"),
+      filename: path_parts[-1]
+    )
+  end
+
+  class PathAndFileAnchor
+    getter file_anchor : String
+    getter path : String
+
+    def initialize(@path : String, filename : String)
+      @file_anchor = "file-" + filename.tr(" ", "-").tr(".", "-")
+    end
+  end
+end
+
+class MissingGistFile
+  GIST_HOST_AND_SCHEME = "https://#{GIST_HOST}"
+
+  def initialize(@id : String, @filename : String?)
+  end
+
+  def content
+    <<-TEXT
+    Gist file missing.
+    Click on filename to go to gist.
+    TEXT
+  end
+
+  def href
+    GIST_HOST_AND_SCHEME + "/#{@id}"
+  end
+
+  def filename
+    @filename || "Unknown filename"
+  end
+
+  def ==(other : MissingGistFile)
+    other.filename == filename && other.href == href
+  end
+end
+
+class RateLimitedGistFile
+  GIST_HOST_AND_SCHEME = "https://#{GIST_HOST}"
+
+  def initialize(@id : String, @filename : String?)
+  end
+
+  def content
+    <<-TEXT
+    Can't fetch gist.
+    GitHub rate limit reached.
+    Click on filename to go to gist.
+    TEXT
+  end
+
+  def href
+    GIST_HOST_AND_SCHEME + "/#{@id}"
+  end
+
+  def filename
+    @filename || "Unknown filename"
+  end
+
+  def ==(other : RateLimitedGistFile)
+    other.filename == filename && other.href == href
+  end
+end

src/models/gist_params.cr (new file, 30 lines)
@@ -0,0 +1,30 @@
+class GistParams
+  class MissingGistId < Exception
+  end
+
+  GIST_ID_REGEX = /[a-f\d]+$/i
+
+  getter id : String
+  getter filename : String?
+
+  def self.extract_from_url(href : String)
+    uri = URI.parse(href)
+    maybe_id = Monads::Try(Regex::MatchData)
+      .new(->{ uri.path.match(GIST_ID_REGEX) })
+      .to_maybe
+      .fmap(->(matches : Regex::MatchData) { matches[0] })
+    case maybe_id
+    in Monads::Just
+      id = maybe_id.value!
+    in Monads::Nothing, Monads::Maybe
+      raise MissingGistId.new(href)
+    end
+
+    filename = uri.query_params["file"]?
+
+    new(id: id, filename: filename)
+  end
+
+  def initialize(@id : String, @filename : String?)
+  end
+end

@@ -217,15 +217,21 @@ module Nodes
   end

   class GithubGist
-    def initialize(@href : String)
+    getter gist_store : GistStore | RateLimitedGistStore
+
+    def initialize(@href : String, @gist_store : GistStore | RateLimitedGistStore)
     end

-    def src
-      "#{@href}.js"
+    def files : Array(GistFile) | Array(MissingGistFile) | Array(RateLimitedGistFile)
+      gist_store.get_gist_files(params.id, params.filename)
+    end
+
+    private def params
+      GistParams.extract_from_url(@href)
     end

     def ==(other : GithubGist)
-      other.src == src
+      other.gist_store == gist_store
     end

     def empty?

@@ -38,6 +38,16 @@ class PostResponse
     property iframe : IFrame?
     property layout : String?
     property metadata : Metadata?
+
+    def initialize(
+      @text : String?,
+      @type : ParagraphType,
+      @markups : Array(Markup),
+      @iframe : IFrame?,
+      @layout : String?,
+      @metadata : Metadata?
+    )
+    end
   end

   enum ParagraphType
@@ -80,6 +90,9 @@ class PostResponse

   class IFrame < Base
     property mediaResource : MediaResource
+
+    def initialize(@mediaResource : MediaResource)
+    end
   end

   class MediaResource < Base
@@ -87,6 +100,14 @@ class PostResponse
     property iframeSrc : String
     property iframeWidth : Int32
     property iframeHeight : Int32
+
+    def initialize(
+      @href : String,
+      @iframeSrc : String,
+      @iframeWidth : Int32,
+      @iframeHeight : Int32
+    )
+    end
   end

   class Metadata < Base

@@ -1,3 +1,3 @@
 module Scribe
-  VERSION = "2022-01-08"
+  VERSION = "2022-01-23"
 end