scribe/spec/classes/paragraph_converter_spec.cr

332 lines
8.2 KiB
Crystal
Raw Normal View History

First step rendering a page The API responds with a bunch of paragraphs which the client converts into Paragraph objects. This turns the paragraphs in a PostResponse's Paragraph objects into the form needed to render them on a page. This includes converting flat list elements into list elements nested by a UL. And adding a limited markups along the way. The array of paragraphs is passed to a recursive function. The function takes the first paragraph and either wraps the (marked up) contents in a container tag (like Paragraph or Heading3), and then moves onto the next tag. If it finds a list, it starts parsing the next paragraphs as a list instead. Originally, this was implemented like so: ```crystal paragraph = paragraphs.shift if list? convert_list([paragraph] + paragraphs) end ``` However, passing the `paragraphs` after adding it to the already shifted `paragraph` creates a new object. This means `paragraphs` won't be mutated and once the list is parsed, it starts with the next element of the list. Instead, the element is `shift`ed inside each converter. ```crystal if paragraphs.first == list? convert_list(paragraphs) end def convert_list(paragraphs) paragraph = paragraphs.shift # ... end ``` When rendering, there is an Empty and Container object. These represent a kind of "null object" for both leafs and parent objects respectively. They should never actually render. Emptys are filtered out, and Containers are never created explicitly but this will make the types pass. IFrames are a bit of a special case. Each IFrame has custom data on it that this system would need to be aware of. For now, instead of trying to parse the seemingly large number of iframe variations and dealing with embedded iframe problems, this will just keep track of the source page URL and send the user there with a link.
2021-05-16 14:14:25 -04:00
require "../spec_helper"
include Nodes
describe ParagraphConverter do
it "converts a simple structure with no markups" do
paragraphs = Array(PostResponse::Paragraph).from_json <<-JSON
[
{
"text": "Title",
"type": "H3",
"markups": [],
"iframe": null,
"layout": null,
"metadata": null
}
]
JSON
expected = [Heading3.new(children: [Text.new(content: "Title")] of Child)]
result = ParagraphConverter.new.convert(paragraphs)
result.should eq expected
end
it "converts a simple structure with a markup" do
paragraphs = Array(PostResponse::Paragraph).from_json <<-JSON
[
{
"text": "inline code",
"type": "P",
"markups": [
{
"name": null,
"title": null,
"type": "CODE",
"href": null,
"start": 7,
"end": 11,
"rel": null,
"anchorType": null
}
],
"iframe": null,
"layout": null,
"metadata": null
}
]
JSON
expected = [
Paragraph.new(children: [
Text.new(content: "inline "),
Code.new(children: [Text.new(content: "code")] of Child),
2021-08-08 14:23:38 -04:00
] of Child),
First step rendering a page The API responds with a bunch of paragraphs which the client converts into Paragraph objects. This turns the paragraphs in a PostResponse's Paragraph objects into the form needed to render them on a page. This includes converting flat list elements into list elements nested by a UL. And adding a limited markups along the way. The array of paragraphs is passed to a recursive function. The function takes the first paragraph and either wraps the (marked up) contents in a container tag (like Paragraph or Heading3), and then moves onto the next tag. If it finds a list, it starts parsing the next paragraphs as a list instead. Originally, this was implemented like so: ```crystal paragraph = paragraphs.shift if list? convert_list([paragraph] + paragraphs) end ``` However, passing the `paragraphs` after adding it to the already shifted `paragraph` creates a new object. This means `paragraphs` won't be mutated and once the list is parsed, it starts with the next element of the list. Instead, the element is `shift`ed inside each converter. ```crystal if paragraphs.first == list? convert_list(paragraphs) end def convert_list(paragraphs) paragraph = paragraphs.shift # ... end ``` When rendering, there is an Empty and Container object. These represent a kind of "null object" for both leafs and parent objects respectively. They should never actually render. Emptys are filtered out, and Containers are never created explicitly but this will make the types pass. IFrames are a bit of a special case. Each IFrame has custom data on it that this system would need to be aware of. For now, instead of trying to parse the seemingly large number of iframe variations and dealing with embedded iframe problems, this will just keep track of the source page URL and send the user there with a link.
2021-05-16 14:14:25 -04:00
]
result = ParagraphConverter.new.convert(paragraphs)
result.should eq expected
end
it "groups <ul> list items into one list" do
paragraphs = Array(PostResponse::Paragraph).from_json <<-JSON
[
{
"text": "One",
"type": "ULI",
"markups": [],
"iframe": null,
"layout": null,
"metadata": null
},
{
"text": "Two",
"type": "ULI",
"markups": [],
"iframe": null,
"layout": null,
"metadata": null
},
{
"text": "Not a list item",
"type": "P",
"markups": [],
"iframe": null,
"layout": null,
"metadata": null
}
]
JSON
expected = [
UnorderedList.new(children: [
ListItem.new(children: [Text.new(content: "One")] of Child),
ListItem.new(children: [Text.new(content: "Two")] of Child),
] of Child),
Paragraph.new(children: [Text.new(content: "Not a list item")] of Child),
]
result = ParagraphConverter.new.convert(paragraphs)
result.should eq expected
end
it "groups <ol> list items into one list" do
paragraphs = Array(PostResponse::Paragraph).from_json <<-JSON
[
{
"text": "One",
"type": "OLI",
"markups": [],
"iframe": null,
"layout": null,
"metadata": null
},
{
"text": "Two",
"type": "OLI",
"markups": [],
"iframe": null,
"layout": null,
"metadata": null
},
{
"text": "Not a list item",
"type": "P",
"markups": [],
"iframe": null,
"layout": null,
"metadata": null
}
]
JSON
expected = [
OrderedList.new(children: [
ListItem.new(children: [Text.new(content: "One")] of Child),
ListItem.new(children: [Text.new(content: "Two")] of Child),
] of Child),
Paragraph.new(children: [Text.new(content: "Not a list item")] of Child),
]
result = ParagraphConverter.new.convert(paragraphs)
result.should eq expected
end
it "converts an IMG to a Figure" do
paragraph = PostResponse::Paragraph.from_json <<-JSON
{
"text": "Image by someuser",
"type": "IMG",
"markups": [
{
"title": "",
"type": "A",
"href": "https://unsplash.com/@someuser",
"userId": null,
"start": 9,
"end": 17,
"rel": "photo-creator",
"anchorType": "LINK"
}
],
"iframe": null,
"layout": "INSET_CENTER",
"metadata": {
"id": "image.png",
"originalWidth": 1000,
"originalHeight": 600
}
}
JSON
expected = [
Figure.new(children: [
Image.new(src: "image.png", originalWidth: 1000, originalHeight: 600),
FigureCaption.new(children: [
Text.new("Image by "),
Anchor.new(
children: [Text.new("someuser")] of Child,
href: "https://unsplash.com/@someuser"
),
] of Child),
] of Child),
]
result = ParagraphConverter.new.convert([paragraph])
result.should eq expected
end
First step rendering a page The API responds with a bunch of paragraphs which the client converts into Paragraph objects. This turns the paragraphs in a PostResponse's Paragraph objects into the form needed to render them on a page. This includes converting flat list elements into list elements nested by a UL. And adding a limited markups along the way. The array of paragraphs is passed to a recursive function. The function takes the first paragraph and either wraps the (marked up) contents in a container tag (like Paragraph or Heading3), and then moves onto the next tag. If it finds a list, it starts parsing the next paragraphs as a list instead. Originally, this was implemented like so: ```crystal paragraph = paragraphs.shift if list? convert_list([paragraph] + paragraphs) end ``` However, passing the `paragraphs` after adding it to the already shifted `paragraph` creates a new object. This means `paragraphs` won't be mutated and once the list is parsed, it starts with the next element of the list. Instead, the element is `shift`ed inside each converter. ```crystal if paragraphs.first == list? convert_list(paragraphs) end def convert_list(paragraphs) paragraph = paragraphs.shift # ... end ``` When rendering, there is an Empty and Container object. These represent a kind of "null object" for both leafs and parent objects respectively. They should never actually render. Emptys are filtered out, and Containers are never created explicitly but this will make the types pass. IFrames are a bit of a special case. Each IFrame has custom data on it that this system would need to be aware of. For now, instead of trying to parse the seemingly large number of iframe variations and dealing with embedded iframe problems, this will just keep track of the source page URL and send the user there with a link.
2021-05-16 14:14:25 -04:00
it "converts all the tags" do
paragraphs = Array(PostResponse::Paragraph).from_json <<-JSON
[
{
"text": "text",
"type": "H3",
"markups": [],
"iframe": null,
"layout": null,
"metadata": null
},
{
"text": "text",
"type": "H4",
"markups": [],
"iframe": null,
"layout": null,
"metadata": null
},
{
"text": "text",
"type": "P",
"markups": [],
"iframe": null,
"layout": null,
"metadata": null
},
{
"text": "text",
"type": "PRE",
"markups": [],
"iframe": null,
"layout": null,
"metadata": null
},
{
"text": "text",
"type": "BQ",
"markups": [],
"iframe": null,
"layout": null,
"metadata": null
},
{
"text": "text",
"type": "PQ",
"markups": [],
"iframe": null,
"layout": null,
"metadata": null
},
First step rendering a page The API responds with a bunch of paragraphs which the client converts into Paragraph objects. This turns the paragraphs in a PostResponse's Paragraph objects into the form needed to render them on a page. This includes converting flat list elements into list elements nested by a UL. And adding a limited markups along the way. The array of paragraphs is passed to a recursive function. The function takes the first paragraph and either wraps the (marked up) contents in a container tag (like Paragraph or Heading3), and then moves onto the next tag. If it finds a list, it starts parsing the next paragraphs as a list instead. Originally, this was implemented like so: ```crystal paragraph = paragraphs.shift if list? convert_list([paragraph] + paragraphs) end ``` However, passing the `paragraphs` after adding it to the already shifted `paragraph` creates a new object. This means `paragraphs` won't be mutated and once the list is parsed, it starts with the next element of the list. Instead, the element is `shift`ed inside each converter. ```crystal if paragraphs.first == list? convert_list(paragraphs) end def convert_list(paragraphs) paragraph = paragraphs.shift # ... end ``` When rendering, there is an Empty and Container object. These represent a kind of "null object" for both leafs and parent objects respectively. They should never actually render. Emptys are filtered out, and Containers are never created explicitly but this will make the types pass. IFrames are a bit of a special case. Each IFrame has custom data on it that this system would need to be aware of. For now, instead of trying to parse the seemingly large number of iframe variations and dealing with embedded iframe problems, this will just keep track of the source page URL and send the user there with a link.
2021-05-16 14:14:25 -04:00
{
"text": "text",
"type": "ULI",
"markups": [],
"iframe": null,
"layout": null,
"metadata": null
},
{
"text": "text",
"type": "OLI",
"markups": [],
"iframe": null,
"layout": null,
"metadata": null
},
{
"text": "text",
"type": "IMG",
"markups": [],
"iframe": null,
"layout": null,
"metadata": {
"id": "1*miroimage.png",
"originalWidth": 618,
"originalHeight": 682
}
},
{
"text": "",
"type": "IFRAME",
"markups": [],
"iframe": {
"mediaResource": {
Render embedded content PostResponse::Paragraph's that are of type IFRAME have extra data in the iframe attribute to specify what's in the iframe. Not all data is the same, however. I've identified three types and am using the new EmbeddedConverter class to convert them: * EmbeddedContent, the full iframe experience * GithubGist, because medium or github treat embeds differently for whatever reason * EmbeddedLink, the old style, just a link to the content. Effectively a fallback The size of the original iframe is also specified as an attribute. This code resizes it. The resizing is determined by figuring out the width/height ratio and setting the width to 800. EmbeddedContent can be displayed if we have an embed.ly url, which most iframe response data has. GitHub gists are a notable exception. Gists instead can be embedded simply by taking the gist URL and attaching .js to the end. That becomes the iframe's src attribute. The PostResponse::Paragraph's iframe attribute is nillable. Previous code used lots of if-statements with variable bindings to work with the possible nil values: ```crystal if foo = obj.nillable_value # obj.nillable_value was not nil and foo contains the value else # obj.nillable_value was nil so do something else end ``` See https://crystal-lang.org/reference/syntax_and_semantics/if_var.html for more info In the EmbeddedConverter the monads library has been introduced to get rid of at least one level of nillability. This wraps values in Maybe which allows for a cleaner interface: ```crystal Monads::Try(Value).new(->{ obj.nillable_value }) .to_maybe .fmap(->(value: Value) { # do something with value }) .value_or(# value was nil, do something else) ``` This worked to get the iframe attribute from a Paragraph: ```crystal Monads::Try(PostResponse::IFrame).new(->{ paragraph.iframe }) .to_maybe .fmap(->(iframe : PostResponse::IFrame) { # iframe is not nil! }) .fmap(#and so on) .value_or(Empty.new) ``` iframe only has one attribute: mediaResource which contains the iframe data. That was used to determine one of the three types above. Finally, Tufte.css has options for iframes. They mostly look good except for tweets which are too small and weirdly in the center of the page which actually looks off-center. That's for another day though.
2021-09-13 13:27:52 -04:00
"href": "https://example.com",
"iframeSrc": "",
"iframeWidth": 0,
"iframeHeight": 0
First step rendering a page The API responds with a bunch of paragraphs which the client converts into Paragraph objects. This turns the paragraphs in a PostResponse's Paragraph objects into the form needed to render them on a page. This includes converting flat list elements into list elements nested by a UL. And adding a limited markups along the way. The array of paragraphs is passed to a recursive function. The function takes the first paragraph and either wraps the (marked up) contents in a container tag (like Paragraph or Heading3), and then moves onto the next tag. If it finds a list, it starts parsing the next paragraphs as a list instead. Originally, this was implemented like so: ```crystal paragraph = paragraphs.shift if list? convert_list([paragraph] + paragraphs) end ``` However, passing the `paragraphs` after adding it to the already shifted `paragraph` creates a new object. This means `paragraphs` won't be mutated and once the list is parsed, it starts with the next element of the list. Instead, the element is `shift`ed inside each converter. ```crystal if paragraphs.first == list? convert_list(paragraphs) end def convert_list(paragraphs) paragraph = paragraphs.shift # ... end ``` When rendering, there is an Empty and Container object. These represent a kind of "null object" for both leafs and parent objects respectively. They should never actually render. Emptys are filtered out, and Containers are never created explicitly but this will make the types pass. IFrames are a bit of a special case. Each IFrame has custom data on it that this system would need to be aware of. For now, instead of trying to parse the seemingly large number of iframe variations and dealing with embedded iframe problems, this will just keep track of the source page URL and send the user there with a link.
2021-05-16 14:14:25 -04:00
}
},
"layout": null,
"metadata": null
2021-09-07 21:13:28 -04:00
},
{
"text": "Mixtape",
"type": "MIXTAPE_EMBED",
"href": null,
"layout": null,
"markups": [
{
"title": "https://example.com",
"type": "A",
"href": "https://example.com",
"userId": null,
"start": 0,
"end": 7,
"anchorType": "LINK"
}
],
"iframe": null,
"metadata": null
First step rendering a page The API responds with a bunch of paragraphs which the client converts into Paragraph objects. This turns the paragraphs in a PostResponse's Paragraph objects into the form needed to render them on a page. This includes converting flat list elements into list elements nested by a UL. And adding a limited markups along the way. The array of paragraphs is passed to a recursive function. The function takes the first paragraph and either wraps the (marked up) contents in a container tag (like Paragraph or Heading3), and then moves onto the next tag. If it finds a list, it starts parsing the next paragraphs as a list instead. Originally, this was implemented like so: ```crystal paragraph = paragraphs.shift if list? convert_list([paragraph] + paragraphs) end ``` However, passing the `paragraphs` after adding it to the already shifted `paragraph` creates a new object. This means `paragraphs` won't be mutated and once the list is parsed, it starts with the next element of the list. Instead, the element is `shift`ed inside each converter. ```crystal if paragraphs.first == list? convert_list(paragraphs) end def convert_list(paragraphs) paragraph = paragraphs.shift # ... end ``` When rendering, there is an Empty and Container object. These represent a kind of "null object" for both leafs and parent objects respectively. They should never actually render. Emptys are filtered out, and Containers are never created explicitly but this will make the types pass. IFrames are a bit of a special case. Each IFrame has custom data on it that this system would need to be aware of. For now, instead of trying to parse the seemingly large number of iframe variations and dealing with embedded iframe problems, this will just keep track of the source page URL and send the user there with a link.
2021-05-16 14:14:25 -04:00
}
]
JSON
expected = [
Heading2.new([Text.new("text")] of Child),
First step rendering a page The API responds with a bunch of paragraphs which the client converts into Paragraph objects. This turns the paragraphs in a PostResponse's Paragraph objects into the form needed to render them on a page. This includes converting flat list elements into list elements nested by a UL. And adding a limited markups along the way. The array of paragraphs is passed to a recursive function. The function takes the first paragraph and either wraps the (marked up) contents in a container tag (like Paragraph or Heading3), and then moves onto the next tag. If it finds a list, it starts parsing the next paragraphs as a list instead. Originally, this was implemented like so: ```crystal paragraph = paragraphs.shift if list? convert_list([paragraph] + paragraphs) end ``` However, passing the `paragraphs` after adding it to the already shifted `paragraph` creates a new object. This means `paragraphs` won't be mutated and once the list is parsed, it starts with the next element of the list. Instead, the element is `shift`ed inside each converter. ```crystal if paragraphs.first == list? convert_list(paragraphs) end def convert_list(paragraphs) paragraph = paragraphs.shift # ... end ``` When rendering, there is an Empty and Container object. These represent a kind of "null object" for both leafs and parent objects respectively. They should never actually render. Emptys are filtered out, and Containers are never created explicitly but this will make the types pass. IFrames are a bit of a special case. Each IFrame has custom data on it that this system would need to be aware of. For now, instead of trying to parse the seemingly large number of iframe variations and dealing with embedded iframe problems, this will just keep track of the source page URL and send the user there with a link.
2021-05-16 14:14:25 -04:00
Heading3.new([Text.new("text")] of Child),
Paragraph.new([Text.new("text")] of Child),
Preformatted.new([Text.new("text")] of Child),
BlockQuote.new([Text.new("text")] of Child), # BQ
BlockQuote.new([Text.new("text")] of Child), # PQ
First step rendering a page The API responds with a bunch of paragraphs which the client converts into Paragraph objects. This turns the paragraphs in a PostResponse's Paragraph objects into the form needed to render them on a page. This includes converting flat list elements into list elements nested by a UL. And adding a limited markups along the way. The array of paragraphs is passed to a recursive function. The function takes the first paragraph and either wraps the (marked up) contents in a container tag (like Paragraph or Heading3), and then moves onto the next tag. If it finds a list, it starts parsing the next paragraphs as a list instead. Originally, this was implemented like so: ```crystal paragraph = paragraphs.shift if list? convert_list([paragraph] + paragraphs) end ``` However, passing the `paragraphs` after adding it to the already shifted `paragraph` creates a new object. This means `paragraphs` won't be mutated and once the list is parsed, it starts with the next element of the list. Instead, the element is `shift`ed inside each converter. ```crystal if paragraphs.first == list? convert_list(paragraphs) end def convert_list(paragraphs) paragraph = paragraphs.shift # ... end ``` When rendering, there is an Empty and Container object. These represent a kind of "null object" for both leafs and parent objects respectively. They should never actually render. Emptys are filtered out, and Containers are never created explicitly but this will make the types pass. IFrames are a bit of a special case. Each IFrame has custom data on it that this system would need to be aware of. For now, instead of trying to parse the seemingly large number of iframe variations and dealing with embedded iframe problems, this will just keep track of the source page URL and send the user there with a link.
2021-05-16 14:14:25 -04:00
UnorderedList.new([ListItem.new([Text.new("text")] of Child)] of Child),
OrderedList.new([ListItem.new([Text.new("text")] of Child)] of Child),
Figure.new(children: [
Image.new(src: "1*miroimage.png", originalWidth: 618, originalHeight: 682),
FigureCaption.new(children: [Text.new("text")] of Child),
] of Child),
Render embedded content PostResponse::Paragraph's that are of type IFRAME have extra data in the iframe attribute to specify what's in the iframe. Not all data is the same, however. I've identified three types and am using the new EmbeddedConverter class to convert them: * EmbeddedContent, the full iframe experience * GithubGist, because medium or github treat embeds differently for whatever reason * EmbeddedLink, the old style, just a link to the content. Effectively a fallback The size of the original iframe is also specified as an attribute. This code resizes it. The resizing is determined by figuring out the width/height ratio and setting the width to 800. EmbeddedContent can be displayed if we have an embed.ly url, which most iframe response data has. GitHub gists are a notable exception. Gists instead can be embedded simply by taking the gist URL and attaching .js to the end. That becomes the iframe's src attribute. The PostResponse::Paragraph's iframe attribute is nillable. Previous code used lots of if-statements with variable bindings to work with the possible nil values: ```crystal if foo = obj.nillable_value # obj.nillable_value was not nil and foo contains the value else # obj.nillable_value was nil so do something else end ``` See https://crystal-lang.org/reference/syntax_and_semantics/if_var.html for more info In the EmbeddedConverter the monads library has been introduced to get rid of at least one level of nillability. This wraps values in Maybe which allows for a cleaner interface: ```crystal Monads::Try(Value).new(->{ obj.nillable_value }) .to_maybe .fmap(->(value: Value) { # do something with value }) .value_or(# value was nil, do something else) ``` This worked to get the iframe attribute from a Paragraph: ```crystal Monads::Try(PostResponse::IFrame).new(->{ paragraph.iframe }) .to_maybe .fmap(->(iframe : PostResponse::IFrame) { # iframe is not nil! }) .fmap(#and so on) .value_or(Empty.new) ``` iframe only has one attribute: mediaResource which contains the iframe data. That was used to determine one of the three types above. Finally, Tufte.css has options for iframes. They mostly look good except for tweets which are too small and weirdly in the center of the page which actually looks off-center. That's for another day though.
2021-09-13 13:27:52 -04:00
EmbeddedLink.new(href: "https://example.com"),
2021-09-07 21:13:28 -04:00
MixtapeEmbed.new(children: [
Anchor.new(
children: [Text.new("Mixtape")] of Child,
href: "https://example.com"
),
] of Child),
First step rendering a page The API responds with a bunch of paragraphs which the client converts into Paragraph objects. This turns the paragraphs in a PostResponse's Paragraph objects into the form needed to render them on a page. This includes converting flat list elements into list elements nested by a UL. And adding a limited markups along the way. The array of paragraphs is passed to a recursive function. The function takes the first paragraph and either wraps the (marked up) contents in a container tag (like Paragraph or Heading3), and then moves onto the next tag. If it finds a list, it starts parsing the next paragraphs as a list instead. Originally, this was implemented like so: ```crystal paragraph = paragraphs.shift if list? convert_list([paragraph] + paragraphs) end ``` However, passing the `paragraphs` after adding it to the already shifted `paragraph` creates a new object. This means `paragraphs` won't be mutated and once the list is parsed, it starts with the next element of the list. Instead, the element is `shift`ed inside each converter. ```crystal if paragraphs.first == list? convert_list(paragraphs) end def convert_list(paragraphs) paragraph = paragraphs.shift # ... end ``` When rendering, there is an Empty and Container object. These represent a kind of "null object" for both leafs and parent objects respectively. They should never actually render. Emptys are filtered out, and Containers are never created explicitly but this will make the types pass. IFrames are a bit of a special case. Each IFrame has custom data on it that this system would need to be aware of. For now, instead of trying to parse the seemingly large number of iframe variations and dealing with embedded iframe problems, this will just keep track of the source page URL and send the user there with a link.
2021-05-16 14:14:25 -04:00
]
result = ParagraphConverter.new.convert(paragraphs)
result.should eq expected
end
end