[XAR2] New implementation of Blocklayout compiler
- From: Jonathan Linowes <jonathan (at) parkerhill.com>
- Date: Fri, 3 Nov 2006 14:31:55 -0500
On Nov 3, 2006, at 2:01 PM, Marcel van der Boom wrote:
> Jason wrote:
>> The available entities (the ones people will use) should be
>> defined in the
>> DTD for XHTML. Could we just pull those in?
>>
> We can, but that would be a bit wasteful (because there are LOTS for
> xhtml, for example). We have to see how this one develops. To get going
> at least, i've been using ( i think two in all of core ) numeric
> entities. (&#160;) being the 99% one, the other i already
> forgot.
>
>> If someone used &pound; in a template, say, and the output page was
>> XHTML
>> UTF-8, then what would (or should) appear in the output page?
>> Would it be
>> 'ï' or £? My guess would be the former.
>>
> Oh my, there are not many people who can answer this correctly and
> precisely. There is no single correct answer either; there are so many
> factors involved.
>
> Let me say what i know, lots of gaps probably.
>
> If the input xml contains &pound;, assume it is defined, either
> explicitly (in the xsl later on) or by a reference to a DOCTYPE
> somewhere. Let's say it is defined as &#1234; (meaning unicode
> character 1234); that's it, input-wise. XSLT doesn't need to know more
> (nor that it is actually a pound sign). For the input, for example:
> &euro;, &#8364; and a literal € are *exactly* the same. (let's hope that all
> comes through ok :-) )
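
A minimal sketch of that point, using Python's standard-library parser rather than anything in Xaraya. The inline DOCTYPE, the element name `r` and the helper names are made up for the illustration, and the entity is declared as 163 (the real pound sign) instead of the hypothetical 1234. A named entity, a decimal reference, a hex reference and the literal character all reach the parser's caller as the same character number, and the named form only parses when something declares it:

```python
import xml.etree.ElementTree as ET

# Hypothetical internal DTD subset that declares the entity for the parser.
DOCTYPE = '<!DOCTYPE r [<!ENTITY pound "&#163;">]>'


def parsed_text(xml_source: str) -> str:
    """Parse a tiny document and return the text of its root element."""
    return ET.fromstring(xml_source).text


named = parsed_text(DOCTYPE + "<r>&pound;</r>")  # named entity, declared above
decimal = parsed_text("<r>&#163;</r>")           # decimal character reference
hexadecimal = parsed_text("<r>&#xA3;</r>")       # hexadecimal character reference
literal = parsed_text("<r>\u00a3</r>")           # the character itself

# After parsing, every spelling is just a character number.
code_points = {ord(text) for text in (named, decimal, hexadecimal, literal)}


def parses(xml_source: str) -> bool:
    """Return True if the parser accepts the document."""
    try:
        ET.fromstring(xml_source)
        return True
    except ET.ParseError:
        return False


# Without a declaration the named entity is an error, not a character.
declared_ok = parses(DOCTYPE + "<r>&pound;</r>")
undeclared_ok = parses("<r>&pound;</r>")
```

Here `code_points` ends up holding a single value, and `undeclared_ok` is false because nothing told the parser what the label means.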
>
> XSLT does its work based on its XSL transformation and gets character
> 1234 in. What it does with that from now on becomes dependent on a
> number of things and the tools used. One thing it needs at least is the
> definition of &pound;, either delivered by the doctype of the input or
> defined in the transformation itself.
>
> The transform as such just puts character number 1234 in the result
> document, done. (assuming the template matches go through and all
> that)
>
> The last bit of the transform is (most likely) serialising the result
> document into a stream of bytes, like an XML document in some
> encoding. The output document may or may not contain a doctype, over
> which you have some control in XSL. This does not change what bytes
> are put into the output; the encoding decides that.
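
A small sketch of the serialising step, again with Python's standard library standing in for whatever serialiser the XSLT processor uses (the element name `p` and the variable names are assumptions for the example). The result tree holds one character; only the chosen output encoding decides which bytes represent it:

```python
import xml.etree.ElementTree as ET

# The result tree contains a single character, number 163 (the pound sign).
root = ET.Element("p")
root.text = chr(163)

# Same tree, three output encodings, three different byte sequences.
utf8_bytes = ET.tostring(root, encoding="utf-8")         # two bytes: C2 A3
latin1_bytes = ET.tostring(root, encoding="iso-8859-1")  # one byte: A3
ascii_bytes = ET.tostring(root, encoding="us-ascii")     # falls back to &#163;

# Reading each serialisation back yields the same character again.
round_trip = [
    ET.fromstring(data).text
    for data in (utf8_bytes, latin1_bytes, ascii_bytes)
]
```

When the encoding cannot represent the character at all (us-ascii here), this serialiser writes a numeric reference instead, which is one way the same character ends up looking different in the page source.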
>
> That doc in turn is then sent off to its destination (say, a browser)
> which renders the bytes into something we can look at. We all know the
> last bit is very different for different browsers even if given the
> exact same set of bytes.
>
> Now the fun begins:
> - if character 1234 isn't available in the character table the browser
> uses -> boom!
> - if the character is available, but your font has no glyph for it ->
> boom!
> - if the encoding of the xml document is out of reach of your browser
> -> boom!
> - if character 1234 is not in the doctype or there is no doctype, it
> can't replace &#1234; with anything possibly more comfortable.
>
> Let's say all of the above is taken care of; then what is actually
> displayed when you look at a textual representation of the document
> (which in itself is a new transformation) depends again on your tool.
>
> - it can be &#1234; (browser leaves it alone)
> - it can be &pound; (browser looked it up in the document's doctype)
> - it can be &blah; (browser looked it up, result doctype defined it
> differently)
>
> The best way to cope with entities is not to pay attention too _long_,
> in my experience. They don't exist; everything is a character number
> and entities are just labels "at that given time". Bit of a
> "Schrödinger's cat" situation. :-) As soon as you look, it's something
> else :-)
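
To make the "just labels" idea concrete, a hedged sketch in Python: the `html.entities` tables (the HTML 4.01 names) stand in for a doctype here, and `label_for` and `character_for` are hypothetical helper names. The character number is the data; the name is only a lookup that some table may or may not provide:

```python
from html.entities import codepoint2name, name2codepoint


def label_for(code_point: int) -> str:
    """Return a named reference if this table has one, else a numeric one."""
    name = codepoint2name.get(code_point)
    return f"&{name};" if name else f"&#{code_point};"


def character_for(name: str) -> str:
    """Resolve a label back to the character it stands for in this table."""
    return chr(name2codepoint[name])


pound_label = label_for(163)     # this table knows a name for 163
other_label = label_for(1234)    # no name in this table, stays numeric
pound_char = character_for("pound")
```

A different table (a different doctype) could attach another name to the same number, or none at all, without the character itself changing.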
>
> In practice, i think most browsers would show you &pound; or &#1234;
> in their source, more likely if the doctype is one of the w3 doctypes
> (forgot the link)
>
> I've been up to my neck in xml for a lot of years and i often get it
> wrong, still. :-)
>
> hope this helps.
>
> marcel
> _______________________________________________
> Xaraya_devel mailing list
> Xaraya_devel@xaraya.com
> http://xaraya.com/mailman/listinfo/xaraya_devel