javascript - Word Cloud for Other Languages
I am using Jason Davies's word cloud project, but there is a problem when using Persian (Farsi) strings: the words overlap in the SVG.
This is the project's output:
What happened to the Farsi words?
As explained on the About page of the project, the generator needs to retrieve the shape of each glyph to be able to compute where it is "safe" to put other words. That page explains the process in more detail, but here is what we care about:
- Glyphs are rendered individually in a hidden <canvas> element.
- The pixel data is retrieved.
- Bounding boxes are derived.
- The word cloud is generated.
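The "pixel data to bounding box" step can be sketched as follows. This is a minimal illustration and not d3-cloud's actual code: the function name glyphBounds and the returned field names are hypothetical, and the input is assumed to be an RGBA byte array laid out like the data property of a canvas ImageData object.

```javascript
// Hypothetical helper (not part of d3-cloud): find the tight bounding box
// of all non-transparent pixels in an RGBA buffer.
// `data` holds 4 bytes per pixel (R, G, B, A), row by row.
function glyphBounds(data, width, height) {
  let minX = width, minY = height, maxX = -1, maxY = -1;
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      const alpha = data[(y * width + x) * 4 + 3];
      if (alpha > 0) {
        // This pixel belongs to the glyph, so grow the box to include it.
        if (x < minX) minX = x;
        if (x > maxX) maxX = x;
        if (y < minY) minY = y;
        if (y > maxY) maxY = y;
      }
    }
  }
  // An entirely transparent buffer has no box at all.
  return maxX < 0 ? null : { x0: minX, y0: minY, x1: maxX, y1: maxY };
}
```

In a browser, the buffer would typically come from drawing the glyph with ctx.fillText(...) and then reading ctx.getImageData(0, 0, width, height).data.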
Now, the critical insight is that in Western (and many other) scripts, glyphs don't often change shape based on context. Yes, there are such things as ligatures, but they are rare and not necessary to the script.
In Persian, however, glyph shape does change based on context. For non-Persian readers, look at ی and س which, when combined, become یس. Yes, that last one is two glyphs!
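To see that the joined form really is just the same two characters, you can inspect the string itself. The snippet below is a small sketch, with the two letters written as escapes (U+06CC and U+0633) so that there is no doubt about which characters are meant; the variable names are only illustrative.

```javascript
const yeh = "\u06CC";   // ARABIC LETTER FARSI YEH, charcode 1740
const seen = "\u0633";  // ARABIC LETTER SEEN, charcode 1587
const joined = yeh + seen;

// The string still holds two separate characters; the joined shape is
// chosen by the text renderer, not stored in the string.
const codes = Array.from(joined, ch => ch.codePointAt(0));
console.log(joined.length, codes);  // length is 2, codes are 1740 and 1587
```

Nothing in the data tells you that the renderer will draw these two characters with different shapes than it uses when each one stands alone.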
The algorithm has no problem dealing with Persian characters, as you can see by hacking the demo on that page and putting a breakpoint right after d.code is generated, so that you are able to modify it.
Replacing it with 1740, the charcode of the first Persian glyph above, and letting the algorithm run shows beautiful and correct bounding boxes around the glyph.
The issue is that when the word cloud is rendered, the glyph is placed in context and... changes shape. The generator doesn't know this, though, and continues to use the old bounding data to place other words, creating the overlap you witnessed. In addition, there seems to be an issue around right-to-left handling of text, which does not help.
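One rough way to check whether a word is affected is to compare the width of its glyphs measured one by one with the width of the whole word measured in context. The sketch below is an assumption-laden illustration, not something the generator does: shapingMismatch is a hypothetical name, the 0.5 pixel tolerance is arbitrary, and the measuring function is passed in so that the logic does not depend on a browser.

```javascript
// Hypothetical check: `measure` is any function that maps a string to its
// rendered width in pixels.
function shapingMismatch(word, measure) {
  const glyphs = Array.from(word);
  // Width if every glyph is measured on its own (isolated shapes).
  const isolated = glyphs.reduce((sum, g) => sum + measure(g), 0);
  // Width of the word as the renderer actually shapes it.
  const inContext = measure(word);
  return { isolated, inContext, differs: Math.abs(isolated - inContext) > 0.5 };
}

// Browser-only usage, skipped where there is no DOM.
if (typeof document !== "undefined") {
  const ctx = document.createElement("canvas").getContext("2d");
  ctx.font = "40px serif";
  console.log(shapingMismatch("\u06CC\u0633", s => ctx.measureText(s).width));
}
```

A difference is only a hint, since kerning can also change widths in Latin text, but a large gap suggests that bounding data taken from isolated glyphs will not match what ends up on screen.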
I encourage you to take this up with the author of the generator directly. The project has a GitHub page: https://github.com/jasondavies/d3-cloud and opening an issue there (and maybe referring to this answer) would help!