## Open an html saved notebook


I have found an HTML page that was clearly created with Mathematica's Save As HTML option. Is there a way to reverse this operation, that is, to open the HTML file in Mathematica and render it as a notebook (.nb) file again?

Without proper specification via Export, almost everything important is rasterized by default, so I don't think so. Why not embedded CDF? – Kuba – 2013-12-09T11:26:13.750

@Kuba OK, I didn't realize that not only images but also expressions were rasterized. What is "embedded CDF"? (I am not the author of the HTML page.) – Red – 2013-12-09T12:35:02.653

CDF. Usually there are links to the source notebooks, but if there aren't any, maybe you can ask the author? I don't believe a text recognition approach will be successful with Mathematica syntax. – Kuba – 2013-12-09T12:40:04.383

@Kuba I don't think he created the webpage himself; he probably wants to run something he found online. Red, is this correct? I've felt that pain before: uncopyable rasterized cells are very annoying. – Szabolcs – 2013-12-09T19:57:45.913

@Szabolcs You are right, but I know it, have you read all the comments? :) – Kuba – 2013-12-09T20:00:00.127

@Kuba I was confused by "why not embedded CDF", it sounded like you assumed he created the document. – Szabolcs – 2013-12-09T20:24:18.520

@Szabolcs Yes, it was so. But then OP said "I am not the author of the html page" – Kuba – 2013-12-09T20:28:14.107

## Answers


I've created a test image with Heike's code from "How to create word clouds?" and posted it... here :)

So let's download it:

pic = Import["http://i.stack.imgur.com/Ni4Kl.png"];


For a full HTML page you can use Import[....html, "Images"] to extract all of its images.

TextRecognize[ImageResize[pic, {Automatic, 1100}]]


The result is not perfect, but it's something. It is a String, so after corrections you can convert it to InputForm.

You can also play with ImageResize; I've chosen 1100 because it gives nice output.

I think the first cleanup operation on the recognized text should be

StringReplace[..., {"»?" -> "#", "f?" -> "#", " . " -> "."}]


:)
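Putting the steps above together, the whole pipeline might look roughly like the sketch below. It is only an illustration: the file name "page.html" and the helper names ocr, cleanup and parse are made up here, the replacement rules are simply the ones suggested above, and the OCR output will most likely still need manual corrections before it parses.

```mathematica
(* Sketch only: "page.html" is a hypothetical local copy of the saved page *)
images = Import["page.html", "Images"];

(* OCR one image after upscaling; 1100 is the height that worked for the test image *)
ocr[img_Image] := TextRecognize[ImageResize[img, {Automatic, 1100}]]

(* First-pass corrections, using the rules suggested above *)
cleanup[s_String] := StringReplace[s, {"»?" -> "#", "f?" -> "#", " . " -> "."}]

(* Try to parse the text without evaluating it; gives $Failed on a syntax error *)
parse[s_String] := Quiet @ Check[ToExpression[s, InputForm, HoldComplete], $Failed]

results = parse /@ cleanup /@ ocr /@ images
```

Any entry of results that comes back as $Failed marks an image whose recognized text still needs fixing by hand. Wrapping in HoldComplete is a deliberate choice here: it keeps code of unknown origin from being evaluated while you inspect it.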