Computers and people: which has more accurate formatting?

Author
Discussion

Globulator

Original Poster:

13,841 posts

232 months

Wednesday 13th February 2008
Yes, it's about formatting and HTML.

HTML and XML use matched tags, a <begin> tag, and a </begin> ending tag.

On forums like this, you can type in:

 [pic]picture.jpg[/quote] 


Which PH will mis-interpret.
So my puzzle is why it does this?
Why not ignore what is in the closing tag entirely?
The closing tag content is unreliable, irrelevant and unnecessary.

PH is not the only forum doing this, but it is really a bug that should no longer exist; in fact, you can write entire websites without closing tag content. It is a hangover from HTML and IMO needs to be dropped.

In addition, a computer can count much better than a person, so it can add closing tags as required. That stops a contributor's markup mistakes from messing up the look of a page or preview.

It's all down to how much technical skill you demand from users versus how much you actually need them to have: two very different things.

Edited by Mr Will to stop breaking everyone's quotes!

Edited by Mr Will on Wednesday 13th February 11:06

dilbert

7,741 posts

232 months

Wednesday 13th February 2008
Globulator said:
Yes, it's about formatting and HTML.

HTML and XML use matched tags, a <begin> tag, and a </begin> ending tag.

On forums like this, you can type in:

 [pic]picture.jpg[/quote] 


Which PH will mis-interpret.
So my puzzle is why it does this?
Why not ignore what is in the closing tag entirely?
The closing tag content is unreliable, irrelevant and unnecessary.

PH is not the only forum doing this, but really it's a bug that should no longer exist, in fact you can write entire websites without closing tag content. It is a hangover from HTML and IMO needs to be dropped thumbup

In addition, a computer can also count much better than a person, and so it can add closing tags as required, preventing incorrect computer programming by a contributor messing up the look of a page or preview.

It's all down to how much technical skill you demand from users, versus how much technical skill you actually need them to have, two very different things biggrin
HTML does not have a matched tag requirement. That's why XML was invented: it's easier to parse. In XML all end tags are compulsory; in HTML an end tag may be one of:

Forbidden
Optional
Compulsory

HTML and XML both derive from SGML (HTML as an SGML application, XML as a restricted subset), with similar constraints on tag elements, but with the specific difference that HTML inherits the SGML end tag flexibility, whereas XML does not.

The constraints in XML are so severe that even something like the IMG tag, where an end tag does not make sense, must show a slash at the end of the tag.
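To make the difference concrete, here is a minimal sketch using only Python's standard library. The helper names (TagLogger, html_events, xml_is_well_formed) are made up for illustration; html.parser is a tolerant tokenizer rather than a validating SGML parser, so this only shows the loose versus strict behaviour, not DTD checking.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class TagLogger(HTMLParser):
    """Records start and end tags as the tolerant HTML tokenizer sees them."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))


def html_events(markup):
    """Tokenize HTML-style markup; missing end tags are not an error here."""
    parser = TagLogger()
    parser.feed(markup)
    parser.close()
    return parser.events


def xml_is_well_formed(markup):
    """Return True only if the markup parses as well-formed XML."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# HTML style: no </p>, no closing slash on img, still tokenizes without complaint.
print(html_events('<p>one<p>two<img src="a.jpg">'))
# XML style: the same img without a slash is rejected, with a slash it is accepted.
print(xml_is_well_formed('<p>one<img src="a.jpg"></p>'))
print(xml_is_well_formed('<p>one<img src="a.jpg"/></p>'))
```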

If you want to validate HTML against the HTML DTD, you can use the W3C validator. If you want to correct dodgy HTML, you could use my SGML state-based parser with my validation extension.

The trouble with correcting dodgy HTML is that if it is dodgy, the way it renders in a browser is ambiguous. It is therefore impossible to guarantee that automatically corrected HTML will render the same before and after correction, without human intervention.

The real problem is that IE will accept poor HTML and still render reasonably. People assume that because it renders OK, it's actually good HTML.

That XML is stricter is good, because it forces people to write more compliant web pages. The sad part is that SGML is more flexible and less long-winded to write, but few people can read it because there aren't many parsers. The bigger the drive for XML on the internet, the fewer people will recognise the benefits of SGML.

Edited by dilbert on Wednesday 13th February 10:59

zaktoo

805 posts

208 months

Wednesday 13th February 2008
The only way I can see to solve it is not to have the software ignore tags which need to be there, but for PH to provide a WYSIWYG inline editor for composing replies and posts. That can generate correctly formatted content without requiring too much knowledge from the user.

In fact, if anyone is reading this who thinks it a good idea, I'll put my hand up to code such a thing...

Globulator

Original Poster:

13,841 posts

232 months

Wednesday 13th February 2008
zaktoo said:
The only way I can see to solve it is not to have the software ignore tags which need to be there, but for PH to provide a WYSIWYG inline editor for composing replies and posts. That can generate correctly formatted content without requiring too much knowledge from the user.

In fact, if anyone is reading this who thinks it a good idea, I'll put my hand up to code such a thing...
Dilbert's quote above really proves my point, doesn't it?
No closing tag content is needed at all: you just need the opening tag. The closing tag needs to exist, but its content is entirely redundant and sometimes misleading.

Dilbert:
True, HTML does not always mandate closing tags. I merely stated that the forum-text formatting only has [opening] and [/closing] tags that hopefully match, because the coders looked at the HTML and didn't think it through.
You only need a simple stack to follow which tag you are on.

zaktoo

805 posts

208 months

Wednesday 13th February 2008
Globulator said:
HTML does not always mandate closing tags true, I merely stated that the forum-text formatting only has [opening] and [/closing] tags that hopefully match because of the coders looking at the HTML and not thinking it through.
You only need a simple stack to follow which tag you are on.
Not so. How do you know which bit of the stack you're on? Consider:
For Example said:
some text bold text some other text
how could you possibly know where the bold bit ended?

It gets worse so:
Another example said:
text bold textbold italic textitalic text only more normal text

dilbert

7,741 posts

232 months

Wednesday 13th February 2008
Globulator said:
zaktoo said:
The only way I can see to solve it is not to have the software ignore tags which need to be there, but for PH to provide a WYSIWYG inline editor for composing replies and posts. That can generate correctly formatted content without requiring too much knowledge from the user.

In fact, if anyone is reading this who thinks it a good idea, I'll put my hand up to code such a thing...
Dilberts quote above really proves my point doesn't it biggrin
No closing tag content is needed at all, you just need the opening tag, the closing tag needs to exist but it's content is entirely redundant and sometimes misleading.

Dilbert:
HTML does not always mandate closing tags true, I merely stated that the forum-text formatting only has [opening] and [/closing] tags that hopefully match because of the coders looking at the HTML and not thinking it through.
You only need a simple stack to follow which tag you are on.
I wouldn't say it's that easy.

From a parser perspective, managing a stack which is not the execution stack is mandatory for a task like this. Although it's just a stack, it's definitely not a pipe! The relationship between the execution stack and the managed stack is esoteric.

It requires significant consideration to get it robust, and it's not as fast as simply knowing that there will, without question, be an end tag, or some indication that no end tag is present.

The point is that where a compulsory end tag is present or absolutely known to be absent, the execution stack and the managed stack can be the same. No processing has to be done to establish the relationship between the execution stack and the managed stack.

Certainly, if you have a big website on your hands, such detail is probably more than you have time for. Equally, for a reasonable sum, I could lend my talent and code base to assist with any such perceived need!


Mr Will

13,719 posts

207 months

Wednesday 13th February 2008
Globulator said:
zaktoo said:
The only way I can see to solve it is not to have the software ignore tags which need to be there, but for PH to provide a WYSIWYG inline editor for composing replies and posts. That can generate correctly formatted content without requiring too much knowledge from the user.

In fact, if anyone is reading this who thinks it a good idea, I'll put my hand up to code such a thing...
Dilberts quote above really proves my point doesn't it biggrin
No closing tag content is needed at all, you just need the opening tag, the closing tag needs to exist but it's content is entirely redundant and sometimes misleading.

Dilbert:
HTML does not always mandate closing tags true, I merely stated that the forum-text formatting only has [opening] and [/closing] tags that hopefully match because of the coders looking at the HTML and not thinking it through.
You only need a simple stack to follow which tag you are on.
What would you suggest instead? Every formatting code (that I can think of anyway!) has some kind of marker to tell the parser where the formatting should end, and all have their flaws.

It is genuinely interesting to hear what people would prefer.

dilbert

7,741 posts

232 months

Wednesday 13th February 2008
Mr Will said:
Globulator said:
zaktoo said:
The only way I can see to solve it is not to have the software ignore tags which need to be there, but for PH to provide a WYSIWYG inline editor for composing replies and posts. That can generate correctly formatted content without requiring too much knowledge from the user.

In fact, if anyone is reading this who thinks it a good idea, I'll put my hand up to code such a thing...
Dilberts quote above really proves my point doesn't it biggrin
No closing tag content is needed at all, you just need the opening tag, the closing tag needs to exist but it's content is entirely redundant and sometimes misleading.

Dilbert:
HTML does not always mandate closing tags true, I merely stated that the forum-text formatting only has [opening] and [/closing] tags that hopefully match because of the coders looking at the HTML and not thinking it through.
You only need a simple stack to follow which tag you are on.
What would you suggest instead? Every formatting code (that i can think of anyway!) has some kind of marker to tell the parser where the formatting should end, and all have their flaws.

It is genuinely interesting to hear what people would prefer.
I have a feeling that the OP was suggesting some sort of method to put HTML direct into the post edit box, and have it appear as one would expect on the forum.

Although I already have a solution that I would be keen to sell, I am impartial, because I think that the existing system is perfectly adequate. There are reasons for having such a capability, but I don't think these are they.

Irrespective, there is a significant drive toward XML web pages, on the wider web. With that in mind I think most people would be overwhelmed with a step toward a more complex system. One can present the drive to XML as a step forward, but personally (really only) I think the OP is right in that optional end tags are better. I can only assume that he too is reasonably technical, and those that think like this are a very small proportion of people in general.

Edited by dilbert on Wednesday 13th February 11:34

Globulator

Original Poster:

13,841 posts

232 months

Wednesday 13th February 2008
dilbert said:
The point is that where a compulsory end tag is present or absolutely known to be absent, the execution stack and the managed stack can be the same. No processing has to be done to establish the relationship between the execution stack and the managed stack.
I'm not sure how the execution stack comes into it - sorry my post was unclear.
Using a simple stack to keep track of what is on the page is not only easy, but I've proven it can be done on a huge scale: ALL CONTENT on This Website is generated by a simple forum text system that uses empty braces to terminate tags. Wot I wrote.

For instance, an italic item will be [i]this is italic[].

I agree that you cannot do something like:
<b>bold <i>and italic</b> or just italic</i>

- but no one ever does that. To get the same effect you would just use:
[b]bold [i]and italic[][][i] or just italic[].

One hidden advantage is the huge speed at which you can bash out content. You never have to remember which tag you are closing, reach for the slash key or type unnecessary characters. Try typing this yourself:

[+1]This is big text [b]in bold[][]

- fast eh?

To maintain backward compatibility you can just treat anything with a / in it (any closing tag) as an empty tag. I think you will struggle to find a single post or article that needs those closing tags to be filled..
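For anyone who wants to play with the idea, here is a rough Python sketch of such a renderer. It is not the actual CuteText code: the TAGS table, the TOKEN pattern and the render function are all assumptions made for illustration, and only [b] and [i] are handled. Any tag containing a slash is treated exactly like an empty [] closer, as described above.

```python
import re

# Hypothetical tag table: forum tag name -> HTML element name.
TAGS = {"b": "b", "i": "i"}

# Matches an opener such as [b], an empty closer [], or a legacy closer such as [/b].
TOKEN = re.compile(r"\[(/?)(\w*)\]")


def render(text):
    """Convert empty-brace forum text to HTML, ignoring closing tag content."""
    out, stack, pos = [], [], 0
    for m in TOKEN.finditer(text):
        slash, name = m.groups()
        is_closer = bool(slash) or name == ""
        if not is_closer and name not in TAGS:
            continue  # unknown opener: leave it in the text untouched
        out.append(text[pos:m.start()])
        pos = m.end()
        if is_closer:
            if stack:  # the closer's name is never looked at
                out.append(f"</{stack.pop()}>")
        else:
            stack.append(TAGS[name])
            out.append(f"<{TAGS[name]}>")
    out.append(text[pos:])
    return "".join(out)


print(render("[b]bold [i]and italic[][][i] or just italic[]."))
print(render("[b]Does it[/i]"))  # a mismatched legacy closer still closes the bold
```

This sketch drops stray closers silently and does not close tags left open at the end of the text; both are choices made to keep it short.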

dilbert

7,741 posts

232 months

Wednesday 13th February 2008
Globulator said:
dilbert said:
The point is that where a compulsory end tag is present or absolutely known to be absent, the execution stack and the managed stack can be the same. No processing has to be done to establish the relationship between the execution stack and the managed stack.
I'm not sure how the execution stack comes into it - sorry my post was unclear.
Using a simple stack to keep track of what is on the page is not only easy, but I've proven it can be done on a huge scale: ALL CONTENT on This Website is generated by a simple forum text system that uses empty braces to terminate tags. Wot I wrote.

For instance, an italic item will be [i]this is italic[].

I agree that you cannot do something like:
<b>bold <i>and italic</b> or just italic</i>

- but no one ever does that. To get the same effect you would just use:
[b]bold [i]and italic[][][i] or just italic[].

One hidden advantage is the huge speed at which you can bash out content with. You never have to remember which tag, use the backslash key or type unnecessary characters, try typing this yourself:

[+1]This is big text [b]in bold[][]

- fast eh?

To maintain backward compatibility you can just treat anything with a / in it (any closing tag) as an empty tag. I think you will struggle to find a single post or article that needs those closing tags to be filled..
Hmm, interesting.....

By execution stack, I was merely trying to make a distinction between an artificially created stack (possibly somewhere on the heap), and the "stack frame", i.e. the place where variables are passed between functions, and variables with function scope are stored.
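One possible reading of that distinction, sketched in Python: when every opening tag is guaranteed a closer, a recursive parser needs no separate tag stack, because each nested call stands for one open tag. The tokenize and parse names and the tuple-based tree are assumptions for illustration, not taken from any real parser.

```python
import re

# Matches [b], [/b] and the empty closer [].
TOKEN = re.compile(r"\[(/?)(\w*)\]")


def tokenize(text):
    """Yield ("text", s), ("open", name) or ("close", name) tokens."""
    pos = 0
    for m in TOKEN.finditer(text):
        if m.start() > pos:
            yield ("text", text[pos:m.start()])
        is_close = bool(m.group(1)) or not m.group(2)
        yield ("close" if is_close else "open", m.group(2))
        pos = m.end()
    if pos < len(text):
        yield ("text", text[pos:])


def parse(tokens):
    """Recursive parse: the call stack doubles as the stack of open tags.

    `tokens` must be a single shared iterator so nested calls consume from it.
    """
    children = []
    for kind, value in tokens:
        if kind == "text":
            children.append(value)
        elif kind == "open":
            children.append((value, parse(tokens)))  # one call per open tag
        else:
            return children  # a closer ends the current call
    return children


print(parse(tokenize("[b]bold [i]and italic[/i][/b] plain")))
```

If end tags may be missing or misplaced, returning from a call is no longer enough to know which tag ended, and that is where an explicitly managed stack comes in.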

What you're suggesting seems like a nice idea, but the question arises about what happens when someone does the thing that you say nobody ever does. I agree, it's sweet all the time they don't, but when they do..... What then?

Whilst it's easy to see in a short little two liner, where the problem has come from, in a monster page, it's much more difficult. I would agree that all you have to do is look for the first error in the rendered output, but there are lots of circumstances where this is less than ideal.

If the source is automatically generated from a number of places, you can't easily figure out if the thing you did is OK, if someone else made an error in a part that they produced earlier.

SGML does not have this limitation. End tags are named. A compulsory end tag will close any amount of child tags, and levels of child tags, in order to match its opening tag.

Once the problem has occurred, with a scheme such as you propose, it's very difficult to pick things up again. Once the context is burst, the sense is gone. SGML is resilient.
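A much simplified Python illustration of that recovery rule: a named end tag pops every open child above its matching opener. The function name close_named is hypothetical, and this ignores DTDs and optional or forbidden end tags entirely, so it is a sketch of the principle rather than of any real SGML parser.

```python
def close_named(stack, name):
    """Close `name` and every child still open above it.

    `stack` is a list of open tag names, innermost last; it is modified in place.
    Returns the tags closed, innermost first. A stray end tag closes nothing.
    """
    if name not in stack:
        return []
    closed = []
    while True:
        top = stack.pop()
        closed.append(top)
        if top == name:
            return closed


open_tags = ["div", "b", "i"]
print(close_named(open_tags, "div"))  # the named end tag recovers the whole context
print(open_tags)
```

With unnamed closers there is nothing to search for, so one missing [] shifts every later closer by one level.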

I accept that your suggestion is fast. Faster than SGML, potentially easier to deal with than XML.

The thing is that code inherently depends on certainty to be reliable. I would suggest that once a mistake allows the chaos to be let in, something you didn't expect to fail will.



Mr Will

13,719 posts

207 months

Wednesday 13th February 2008
Globulator said:
Using a simple stack to keep track of what is on the page is not only easy, but I've proven it can be done on a huge scale: ALL CONTENT on This Website is generated by a simple forum text system that uses empty braces to terminate tags. Wot I wrote.

For instance, an italic item will be [i]this is italic[].

I agree that you cannot do something like:
<b>bold <i>and italic</b> or just italic</i>

- but no one ever does that. To get the same effect you would just use:
[b]bold [i]and italic[][][i] or just italic[].

One hidden advantage is the huge speed at which you can bash out content with. You never have to remember which tag, use the backslash key or type unnecessary characters, try typing this yourself:

[+1]This is big text [b]in bold[][]

- fast eh?

To maintain backward compatibility you can just treat anything with a / in it (any closing tag) as an empty tag. I think you will struggle to find a single post or article that needs those closing tags to be filled..
The problem with this system (aside from not being able to overlap tags) is the lack of clarity and the potential for large knock on effects in the case of any misplaced/mistyped tags.

For example a misplaced quote tag on this site will mess up the quotes in the post but nothing else, whereas an incorrect closing tag in your system could potentially mess up all the formatting. It would then also be harder to fix/debug due to the lack of clarity of which tags are meant to relate to which other tags.

If I was implementing a system like yours, I would be tempted to scrap closing tags altogether and go for something more like this:

 [b=this text should be bold] 


The '[' character would denote that everything up to the '=' is a tag, and the ']' marks the end of the formatting.

It would still have the same weaknesses, but it would be even quicker to type and much easier to read, and bracket matching would make it easier to debug.
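As a quick sketch of how little parsing this needs, here is one way it might look in Python. The render_bracketed name is made up, tag names are passed straight through as HTML element names purely for illustration, and a literal ']' in the text would need an escape that this sketch does not provide.

```python
import re

# Either an opener such as "[b=" (tag name captured) or a closing "]".
TOKEN = re.compile(r"\[(\w+)=|\]")


def render_bracketed(text):
    """Render [tag=content] markup; the matching ']' closes the tag."""
    out, stack, pos = [], [], 0
    for m in TOKEN.finditer(text):
        out.append(text[pos:m.start()])
        pos = m.end()
        if m.group(1):
            stack.append(m.group(1))
            out.append(f"<{m.group(1)}>")
        elif stack:
            out.append(f"</{stack.pop()}>")
        else:
            out.append("]")  # stray bracket: keep it as literal text
    out.append(text[pos:])
    return "".join(out)


print(render_bracketed("[b=this text should be bold]"))
print(render_bracketed("[i=italic with [b=bold] inside]"))
```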

Out of interest, have you ever tried programming in python?

Globulator

Original Poster:

13,841 posts

232 months

Wednesday 13th February 2008
dilbert said:
What you're suggesting seems like a nice idea, but the question arises about what happens when someone does the thing that you say nobody ever does. I agree, it's sweet all the time they don't, but when they do..... What then?
Well forum text is completely under the control of the parser, so this is not a problem at all (unless the parser code is buggy).

If someone messes up and tries
[b]bold [i]and italic[] or just italic[]
instead of
[b]bold [i]and italic[][][i] or just italic[]

Then he just gets this: bold and italic or just italic

The effect (and whole concept in fact) is that the parser is filling in the closing tag content for the user, based on a simple stack - for this statement the stack goes

[push b]bold [push i]and italic[pop (i)][pop (b)][push i] or just italic[pop (i)]
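The same trace can be reproduced with a few lines of Python. This is only a sketch: the trace function and TOKEN pattern are names invented here, and it records the stack operations without producing any HTML.

```python
import re

# [name] pushes a tag; the empty [] pops whatever is on top.
TOKEN = re.compile(r"\[(\w*)\]")


def trace(text):
    """Return the push/pop operations a stack-based parser would perform."""
    stack, ops = [], []
    for m in TOKEN.finditer(text):
        name = m.group(1)
        if name:
            stack.append(name)
            ops.append(f"push {name}")
        elif stack:
            ops.append(f"pop ({stack.pop()})")
    return ops


print(trace("[b]bold [i]and italic[][][i] or just italic[]"))
# The mistyped version: the final [] pops b, so "or just italic" stays bold.
print(trace("[b]bold [i]and italic[] or just italic[]"))
```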

dilbert said:
Whilst it's easy to see in a short little two liner, where the problem has come from, in a monster page, it's much more difficult. I would agree that all you have to do is look for the first error in the rendered output, but there are lots of circumstances where this is less than ideal.
It works fine for huge pages, but you have a good point about errors. My point is that a) that problem already exists, and is made worse by users using the wrong tag, and b) thanks to the stack, you ALWAYS know how to close out, so you can always guarantee well-formed HTML, even if the formatting is sometimes not what the user wanted. No parser can be psychic, though.

dilbert said:
If the source is automatically generated from a number of places, you can't easily figure out if the thing you did is OK, if someone else made an error in a part that they produced earlier.
Actually it is dead easy to spot - the format errors stand out on the page for you to see.

dilbert said:
SGML does not have this limitation. End tags are named. A compulsory end tag will close any amount of child tags, and levels of child tags, in order to match its opening tag.

Once the problem has occurred, with a scheme such as you propose, it's very difficult to pick things up again. Once the context is burst, the sense is gone. SGML is resilient.
SGML is only resilient if the input deck is not corrupted. That said, PH does not do any error checking of incorrect ending tags anyway. [b]Does it[/i]

Globulator

Original Poster:

13,841 posts

232 months

Wednesday 13th February 2008
Mr Will said:
If I was implementing a system like yours, I would be tempted to scrap closing tags all together and go for something more like this:

 [b=this text should be bold] 


the '[' character would denote that everything upto the '=' is a tag, and the ']' marks the end of the formatting.

It would still have the same weaknesses but would be even quicker to type and much easier to read, plus bracket matching would make it easier to debug
Actually I do use that for atomic items, for instance a picture is [Ppicture.jpg] or [Fhttp://somesite/flash/widget.swf]. I think for general items you would get into the nesting complication of losing one's place on big pages, for instance with an indented paragraph ([I]paragraph[]) or tables etc.

The only changes I'm really suggesting for PH are a) to ignore the closing tag content and allow an empty tag to be treated as a closing tag, and b) to tidy up afterwards if someone misses any closing tags (i.e. while ((tag = pop(stack))) close_html(tag)).
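The tidy-up in b) is only a couple of lines in practice. A minimal Python sketch, assuming the parser keeps a list of the HTML element names that are still open (close_remaining is a made-up name):

```python
def close_remaining(stack):
    """Pop every tag still open and return the closing HTML, innermost first."""
    closers = []
    while stack:
        closers.append(f"</{stack.pop()}>")
    return "".join(closers)


# Say a post ended with bold and italic still open:
still_open = ["b", "i"]
print(close_remaining(still_open))
```

Run at the end of every post, this keeps one person's missing closer from leaking formatting into the rest of the page.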

Mr Will said:
Out of interest, have you ever tried programming in python?
Yes I have. I have to say that I do like brackets, as in PHP and C. Perl of course enforces brackets, whereas C and PHP have the friendly concept of statements.

I can get used to Python, but in Vi brackets make navigation so much easier.

dilbert

7,741 posts

232 months

Wednesday 13th February 2008
Globulator said:
dilbert said:
What you're suggesting seems like a nice idea, but the question arises about what happens when someone does the thing that you say nobody ever does. I agree, it's sweet all the time they don't, but when they do..... What then?
Well forum text is completely under the control of the parser, so this is not a problem at all (unless the parser code is buggy).

If someone messes up and tries
[b]bold [i]and italic[] or just italic[]
instead of
[b]bold [i]and italic[][][i] or just italic[]

Then he just gets this: bold and italic or just italic

The effect (and whole concept in fact) is that the parser is filling in the closing tag content for the user, based on a simple stack - for this statement the stack goes

[push b]bold [push i]and italic[pop (i)][pop (b)][push i] or just italic[pop (i)]

dilbert said:
Whilst it's easy to see in a short little two liner, where the problem has come from, in a monster page, it's much more difficult. I would agree that all you have to do is look for the first error in the rendered output, but there are lots of circumstances where this is less than ideal.
It works fine for huge pages, but you have a good point about errors. My point is that a) that problem already exists, and is made worse by users using the wrong tag, and b) due to the stack, you ALWAYS know how to closeout so you can always guarantee perfect HTML, even if the formatting is sometimes not what the user wanted. No parser can be psychic though biggrin

dilbert said:
If the source is automatically generated from a number of places, you can't easily figure out if the thing you did is OK, if someone else made an error in a part that they produced earlier.
Actually it is dead easy to spot - the format errors stand out on the page for you to see.

dilbert said:
SGML does not have this limitation. End tags are named. A compulsory end tag will close any amount of child tags, and levels of child tags, in order to match its opening tag.

Once the problem has occurred, with a scheme such as you propose, it's very difficult to pick things up again. Once the context is burst, the sense is gone. SGML is resilient.
SGML is only resilient if the input deck is not corrupted - but saying this, PH does not do any error checking of incorrect ending tags anyway. [b]Does it[/i] biggrin
I guess it depends on what you mean by "Input Deck".

I mean, the rules I came up with for recovery took some time to arrive at. They work on generic SGML, to any DTD, but if you use a specific DTD, where the end tag status is known, they make a better job of recovery.

Without a DTD my parser has two modes, which are broadly equivalent to the XML browser spec and the HTML spec, although it doesn't look at the tag element names. I call them "strict" and "loose" but they have no connection to the W3C definition of strict and loose. So essentially without a DTD it does the best it can to figure out if the context of a closing tag is that of forbidden, optional or compulsory.

In those cases where a child tag is closed by a parent, it maintains a stack of tags which are "uncertain", and tries to match them with subsequent matching end tags.

Obviously if you have a DTD, then child tag status is always known, and it can make a more accurate job of determining which parent owns which child. Obviously this gives it the capacity to behave gracefully if someone uses any tag in any context which is not a part of the DTD as provided.

The XML generic mode can also take a DTD, but it is not essential to the precision of the parse because the end tag status is always known. The DTD is primarily used there for validation, which is not a part of the parse.

I just looked, and the "loose" generic spec uses 32 discrete ten-input rules, while "strict" uses about fifteen eight-input rules. The loose spec with a DTD does not need to use some of the rules, although they will always come into play if there is an error in the source.

I can reliably parse the big "gassing station" page into objects in about 150ms, on a single thread of my 2.4GHz Quadcore. I find that Amazon is one of the worst pages known to man for compliance, but the parser generally will not mush the page. Typically I get unwanted space on a non-compliant page, which, after an awful lot of frustration, I find satisfactory. (I wasn't going to spend any more effort!)

I'm assuming that "Cute Studio" is the name of a product that's not quite finished yet? Certainly, if you are going to produce a client server system, with the kind of functionality that I would imagine goes with that name, we ought to talk, because I've already done an awful lot of that. In addition I don't have anything quite like what you seem to be offering.

Collaboration is a powerful tool.


Edited by dilbert on Wednesday 13th February 13:40

Mr Will

13,719 posts

207 months

Wednesday 13th February 2008
Globulator said:
Actually I do use that for atomic items, for instance a picture is [Ppicture.jpg] or [Fhttp://somesite/flash/widget.swf]. I think for general items you would get into nesting complications of losing ones place on big pages, for instance an indented paragraph ([I]paragraph[]) or tables etc.
The nesting problem is no different in either case, except that with my system the computer will be able to recognise the corresponding start and end tags more easily (most text editors already feature a bracket pairing function)

It would also help enforce the concept of pairing in the user's mind: brackets instinctively come in pairs, whereas remembering to put in the [] might be hard for novice users.

This doesn't work for a forum posting system, but when used to code a website if you ignore linebreaks then you can format paragraphs like this:

[indent=
Some paragraph text, which might be [b= bold] or not.
]

Which I think is clear and easy to read.
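The bracket pairing that editors do is easy to sketch in Python as well. The match_brackets function below is a hypothetical helper that maps the position of each '[' to the position of its ']', which is all a debugging aid would need to highlight a pair.

```python
def match_brackets(text):
    """Map the index of each '[' to the index of its matching ']'."""
    pairs, open_positions = {}, []
    for i, ch in enumerate(text):
        if ch == "[":
            open_positions.append(i)
        elif ch == "]" and open_positions:
            pairs[open_positions.pop()] = i
    return pairs


sample = "[indent=\nSome paragraph text, which might be [b= bold] or not.\n]"
for start, end in sorted(match_brackets(sample).items()):
    print(start, end, repr(sample[start:end + 1]))
```

Any '[' missing from the result has no partner, which points straight at the unclosed tag.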

Globulator said:
Mr Will said:
Out of interest, have you ever tried programming in python?
Yes I have, I have to say that I do like brackets, as in PHP and C. Perl of course enforces brackets, whereas C and PHP have the friendly concept of statements smile

I can get used to Python, but in Vi brackets make navigation so much easier smile
I always got on well with Python and thought the indentation-based scope was a very interesting idea. Not sure that I would want to write anything too big in it though!

Edited by Mr Will on Wednesday 13th February 13:25

Globulator

Original Poster:

13,841 posts

232 months

Wednesday 13th February 2008
CuteStudio Ltd is the company and the product itself is called 'Silk', but it is not really a client-server system. I think that title is too grand; it is more a presentation format.

Docs are Here for browsing

I've just stuck the reference on so you can read it; it doesn't as a rule tend to live there, however.

It is really a type of wiki, but expands the idea to encompass the entire site, so you can change layout, style etc on a level by level basis, which you will see as you view the module docs. Currently it's finished enough to present stuff with and so I can whack stuff up onto the net, but it does not have any remote editing facility. The main reason I wrote it was in fact for lo-hassle presentation, which it seems to do well for me.

It's the type of thing for intranets and stuff I guess, but the main feature was the ability to hack simple text files to create rich pages and content, you'll notice as you browse I think.

To see the source for any page, at the URL just change /index.php for /page.txt (and /layout.txt for containers - but there is only one at the top level IIRC). Local /style.css files vary the style as one descends into the site.

The Text module is the one that contains the CuteText format that uses no closing tag.
You can also see that text source BTW if you look at the corresponding /page.txt

It also does stuff like showing the size of downloadable files automatically etc, which is verified by the site loading tools offline. In fact to add the reference part then I had to:
1 Copy that directory over
2 Click on 'build site'
3 Click on 'verify site'
4 Click on 'sync upload site'

i.e. this post took considerably longer.

dilbert

7,741 posts

232 months

Wednesday 13th February 2008
I think that's the thing that makes it interesting. I understand that you're dealing with the presentation stuff. TBH it doesn't interest me too much (to have to code it) because it's just a different way of doing the same thing. I would always choose what I would see as the most generic way (SGML) because it maximises the potential for reuse.

The thing is though that there are a lot of people out there who want to make things simpler, quicker, easier, and possibly just different but these things are important too.

I suppose the app I posted here is similar to frontpage, but as a client server solution that can be expensive, depending on what you want to do. Also what I'm doing doesn't have the breadth of something like frontpage. I see that app as being a simple console to the server rather than the environment in which you would want to develop.

Frontpage et al, will often allow you to work with images. I can put a browser window on the app, and I can also offer quite a bit of image processing capability. I can deal with gif, jpeg, and bmp, but I don't yet have png or tiff input capability, and I don't really want to write an image editing application.

Unlike frontpage this solution does not lead you into .NET, which I don't like, and I think a lot of others don't like either. Explicitly, this solution is designed to work with PHP.

There are a whole raft of reasons to make such a capability web based, but there are also a whole load of reasons why you might not want to do that. Certainly I don't think I would want to have to use a web interface to update my pages every day.

One of the important capabilities that it offers is the ability to upload a page and its references to the server automatically. The same is true for downloads.

The server is content managed, in the sense that it supports permalinking. One of the capabilities that the console has is to translate a working page on your disk, into a validated, permalinked page on the server.

The server achieves this through a similar scheme to your own for metadata. The server can produce little CSV files which automatically reflect the site structure, and are easily used to create a page frame in PHP.

On Windows the server is ISAPI, so it's quite quick. All that's necessary on the "web server" is a single PHP file that calls into the "document server". By this means you only need to use FTP once to put the stub on the "web server".

I'm critically conscious that Linux is a (very) cheap alternative to Windows, and a Linux version of the server is definitely planned, for use with Apache. But such things will depend wholly on any interest (or otherwise) that I get.

The client is obviously windows, and I don't see that as such an important problem, but I'm hoping that Wine is going to be my friend there. Ultimately though I guess I'll always be a 'dozer! (Mac ooer - what's that?)

Crazily this was only intended as a capability to support my upcoming website, but as time has progressed it's looked more like it ought to be a product. More oddly some of the other stuff I'm working on is related to signal processing, like your sound studio, but purely IIR/FIR filters, design, and test.

Edited by dilbert on Wednesday 13th February 14:59

The Excession

11,669 posts

251 months

Wednesday 13th February 2008
dilbert said:
Globulator said:
dilbert said:
The point is that where a compulsory end tag is present or absolutely known to be absent, the execution stack and the managed stack can be the same. No processing has to be done to establish the relationship between the execution stack and the managed stack.
I'm not sure how the execution stack comes into it - sorry my post was unclear.
Using a simple stack to keep track of what is on the page is not only easy, but I've proven it can be done on a huge scale: ALL CONTENT on This Website is generated by a simple forum text system that uses empty braces to terminate tags. Wot I wrote.

For instance, an italic item will be [i]this is italic[].

I agree that you cannot do something like:
<b>bold <i>and italic</b> or just italic</i>

- but no one ever does that. To get the same effect you would just use:
[b]bold [i]and italic[][][i] or just italic[].

One hidden advantage is the huge speed at which you can bash out content. You never have to remember which tag, use the slash key or type unnecessary characters. Try typing this yourself:

[+1]This is big text [b]in bold[][]

- fast eh?

To maintain backward compatibility you can just treat anything with a / in it (any closing tag) as an empty tag. I think you will struggle to find a single post or article that needs those closing tags to be filled..
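To make the scheme above concrete, here is a minimal sketch of a renderer for it. It is an illustration only, not the code behind any real site: the function name `render`, the tag table, and the choice of `<big>` for `[+1]` are all assumptions. Empty braces and anything containing a `/` both close the innermost open tag, and anything still open at the end is closed automatically.

```python
import html
import re

# Hypothetical mapping from forum tags to HTML elements; "+1" -> <big> is a guess.
TAG_TO_HTML = {"b": "b", "i": "i", "+1": "big"}

# Any run of characters inside square brackets, including the empty "[]".
TOKEN = re.compile(r"\[([^\[\]]*)\]")


def render(text):
    """Convert [tag]...[] markup to HTML, closing tags from a simple stack."""
    out = []
    stack = []  # HTML element names currently open, innermost last
    pos = 0
    for m in TOKEN.finditer(text):
        out.append(html.escape(text[pos:m.start()], quote=False))
        pos = m.end()
        name = m.group(1)
        if name == "" or "/" in name:
            # Empty braces, or any legacy closing tag: close the innermost tag.
            if stack:
                out.append("</%s>" % stack.pop())
        elif name in TAG_TO_HTML:
            stack.append(TAG_TO_HTML[name])
            out.append("<%s>" % stack[-1])
        else:
            # Unknown tag: leave it in the output as literal text.
            out.append(html.escape(m.group(0), quote=False))
    out.append(html.escape(text[pos:], quote=False))
    # Close whatever the author left open.
    while stack:
        out.append("</%s>" % stack.pop())
    return "".join(out)


if __name__ == "__main__":
    print(render("[b]bold [i]and italic[][][i] or just italic[]."))
    print(render("[+1]This is big text [b]in bold[][]"))
```

Note that the content of a legacy closing tag is never inspected, so `[b]text[/i]` renders the same as `[b]text[]`.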
Hmm, interesting.....

By execution stack, I was merely trying to make a distinction between an artificially created stack (possibly somewhere on the heap), and the "stack frame", i.e. the place where variables are passed between functions, and variables with function scope are stored.

What you're suggesting seems like a nice idea, but the question arises about what happens when someone does the thing that you say nobody ever does. I agree, it's sweet all the time they don't, but when they do..... What then?

Whilst it's easy to see where the problem has come from in a short little two-liner, in a monster page it's much more difficult. I would agree that all you have to do is look for the first error in the rendered output, but there are lots of circumstances where this is less than ideal.

If the source is automatically generated from a number of places, you can't easily figure out whether the thing you did is OK if someone else made an error in a part that they produced earlier.

SGML does not have this limitation. End tags are named. A compulsory end tag will close any amount of child tags, and levels of child tags, in order to match its opening tag.

Once the problem has occurred, with a scheme such as you propose, it's very difficult to pick things up again. Once the context is burst, the sense is gone. SGML is resilient.
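The recovery behaviour described here can be sketched roughly as follows. This is a simplification for illustration, not a conforming SGML parser: the function name `normalise` and the rule of silently dropping an unmatched end tag are assumptions. The point is only that a named end tag tells the parser how far to unwind, so one mistake stays local.

```python
import re

# Matches simple <name> and </name> tags only; attributes are out of scope here.
TOKEN = re.compile(r"<(/?)([a-z]+)>")


def normalise(text):
    """Re-emit markup so each named end tag also closes any children left open."""
    out, stack, pos = [], [], 0
    for m in TOKEN.finditer(text):
        out.append(text[pos:m.start()])
        pos = m.end()
        closing, name = m.group(1) == "/", m.group(2)
        if not closing:
            stack.append(name)
            out.append("<%s>" % name)
        elif name in stack:
            # Close unclosed children first, then the named element itself.
            while True:
                top = stack.pop()
                out.append("</%s>" % top)
                if top == name:
                    break
        # An end tag with no matching open tag is dropped (an assumption).
    out.append(text[pos:])
    while stack:
        out.append("</%s>" % stack.pop())
    return "".join(out)


if __name__ == "__main__":
    print(normalise("<b>bold <i>and italic</b> or just italic</i>"))
```

With an anonymous closer there is nothing to search the stack for, which is why the same mistake would shift every later close by one level.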

I accept that your suggestion is fast. Faster than SGML, potentially easier to deal with than XML.

The thing is that code inherently depends on certainty to be reliable. I would suggest that once a mistake allows the chaos to be let in, something you didn't expect to fail will.
I've not read the whole thread yet, but looking forward to the responses.

At this point, I'd say I'm with Dilbert, and I agree it is interesting.

I'd rather see tags closed according to the tag identifier; for one, it completely negates the need for pushing and popping stuff off a stack.

My preference would be to maintain 'state', be that rendering, e.g. font styles, bold, italic, or whatever.

As you read tags like this you just flip states in your state table that tell your code what it is supposed to be doing. Chances are, even if you are doing this on a stack, you've still got a state table somewhere telling your code 'how to behave'
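As a rough sketch of that state-table idea (the names `styled_runs` and `known` are made up for illustration, and the square-bracket syntax with named closers is an assumption), each tag simply switches one flag on or off, and each run of text is reported with whichever flags are set at the time:

```python
import re

# Named open and close tags only, e.g. [b] and [/b].
TOKEN = re.compile(r"\[(/?)([a-z]+)\]")


def styled_runs(text, known=("b", "i")):
    """Return (text, active styles) pairs using a state table and no stack."""
    state = {name: False for name in known}
    runs, pos = [], 0
    for m in TOKEN.finditer(text):
        closing, name = m.group(1) == "/", m.group(2)
        if name not in state:
            continue  # unknown tag: it stays inside the surrounding text run
        if m.start() > pos:
            active = tuple(sorted(k for k, on in state.items() if on))
            runs.append((text[pos:m.start()], active))
        pos = m.end()
        state[name] = not closing  # flip the flag for this style
    if pos < len(text):
        active = tuple(sorted(k for k, on in state.items() if on))
        runs.append((text[pos:], active))
    return runs


if __name__ == "__main__":
    for run in styled_runs("[b]bold [i]and italic[/b] or just italic[/i]"):
        print(run)
```

Because each closer names its style, overlapping tags are handled without any unwinding. The same approach cannot work with anonymous closers, since there would be no way to tell which flag to clear.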

The only advantage I can really see with proposing a stack based end tag system is that it might shorten the input stream a little - i.e. a few less characters to transport and read.

Anyhoo, off to read the rest of the thread now and see if I've completely missed the plot, but so far I'm firmly of the opinion that tags should be closed according to their type.

Globulator

Original Poster:

13,841 posts

232 months

Wednesday 13th February 2008
dilbert said:
I think that's the thing that makes it interesting. I understand that you're dealing with the presentation stuff. TBH it doesn't interest me too much (to have to code it) because it's just a different way of doing the same thing. I would always choose what I would see as the most generic way (SGML) because it maximises the potential for reuse.

The thing is though that there are a lot of people out there who want to make things simpler, quicker, easier, and possibly just different but these things are important too.

I suppose the app I posted here is similar to FrontPage, but as a client-server solution that can be expensive, depending on what you want to do. Also, what I'm doing doesn't have the breadth of something like FrontPage. I see that app as being a simple console to the server rather than the environment in which you would want to develop.

FrontPage et al. will often allow you to work with images. I can put a browser window on the app, and I can also offer quite a bit of image processing capability. I can deal with gif, jpeg, and bmp, but I don't yet have png or tiff input capability, and I don't really want to write an image editing application.

Unlike FrontPage, this solution does not lead you into .NET, which I don't like, and I think a lot of others don't like either. Explicitly, this solution is designed to work with PHP.

There are a whole raft of reasons to make such a capability web based, but there are also a whole load of reasons why you might not want to do that. Certainly I don't think I would want to have to use a web interface to update my pages every day.

One of the important capabilities that it offers is the ability to upload a page and its references to the server automatically. The same is true for downloads.

The server is content managed, in the sense that it supports permalinking. One of the capabilities that the console has is to translate a working page on your disk, into a validated, permalinked page on the server.

The server achieves this through a similar scheme to your own for metadata. The server can produce little CSV files which automatically reflect the site structure, and are easily used to create a page frame in PHP.

On Windows the server is ISAPI, so it's quite quick. All that's necessary on the "web server" is a single PHP file that calls into the "document server". By this means you only need to use FTP once to put the stub on the "web server".
I'm not sure I 100% understand what your app is. Would I be right in guessing it is a client-server document access system, via HTTP, akin to the web but using a different protocol to the usual HTML/CSS?

If it powers a website I'd be interested to have a look-see!

My CuteText renders straight to XHTML/CSS by PHP routines but I may end up using it to power some help pages in apps too. I see you are using C++, I'm a bit hopeless with that so I tend to stick to C.

dilbert said:
I'm critically conscious that Linux is a (very) cheap alternative to Windows, and a Linux version of the server is definitely planned, for use with Apache. But such things will depend wholly on any interest (or otherwise) that I get.

The client is obviously windows, and I don't see that as such an important problem, but I'm hoping that Wine is going to be my friend there. Ultimately though I guess I'll always be a 'dozer! (Mac ooer - what's that?)
I think Wine does a good job for Linux, especially if you compile your sources in with it. I write software for Mac/Linux/Windows, which is simple with text code using MinGW, but for GUI one has to fork at the graphics level into Win32 or X Windows (or Mac native, but I haven't gotten around tuit yet smile).

Funnily enough, for graphics Win32 is really very good whereas X Windows is completely pants (rotated text, anyone? smile), but for program development I always prefer the Unix command line and tools. I'm not sure Wine will allow Mac access though, although I'd be surprised if no one had done it. It can't be rocket science smile

One reason I chose W3C HTML/CSS/JavaScript was to get browser compatibility, just so I could avoid getting into GUI coding for presentation stuff, as experience showed that all the cross-platform porting and testing often took far longer than anything else. Therefore ALL my documentation etc. just tends to be straight HTML/CSS, so I can publish it on the web or scoop it up into wget or a site reader and stick it onto a CD or zip file. Some things you simply can't do that way however!

dilbert said:
Crazily this was only intended as a capability to support my upcoming website, but as time has progressed it's looked more like it ought to be a product. More oddly some of the other stuff I'm working on is related to signal processing, like your sound studio, but purely IIR/FIR filters, design, and test.
I'm more of an analog electronics person myself, despite dabbling in digital, although I've got to integrate a FIR filter soon into DeClip to allow it to upsample to different bit rates.
And yes, soon I'll be gui-ing for that too, no rest for the wicked biggrin

ginettag27

6,300 posts

270 months

Wednesday 13th February 2008
Globulator said:
For instance, an italic item will be [i]this is italic[].

I agree that you cannot do something like:
<b>bold <i>and italic</b> or just italic</i>

- but no one ever does that. To get the same effect you would just use:
[b]bold [i]and italic[][][i] or just italic[].
consider the following :

<b> this is in bold<i> this is in italic and bold</>this is in?</><i> this is in italic</>

How do you know what to render "this is in?" in? is it in bold or italic or neither?

The only way around that would be to make compulsory start tags, which would be a pain and a backwards step imo.

Alternatively why have a fullstop at the end of a typed sentence? Why don't you just say that a typed sentence ends when you come across a capital letter?

Apologies if I've got the wrong end of the stick...
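For reference, here is a small sketch of how a purely stack-based reader would treat the fragment above. It assumes each `</>` simply closes the most recently opened tag, which is my reading of Globulator's scheme rather than anything confirmed; the function name `trace` is made up.

```python
import re

# Either the anonymous closer "</>" or a simple opening tag such as <b>.
TOKEN = re.compile(r"</>|<([a-z]+)>")


def trace(text):
    """Return (text, open tags) pairs, treating </> as 'close the innermost tag'."""
    runs, stack, pos = [], [], 0
    for m in TOKEN.finditer(text):
        if m.start() > pos:
            runs.append((text[pos:m.start()], tuple(stack)))
        pos = m.end()
        if m.group(1) is None:
            if stack:
                stack.pop()  # anonymous closer: pop whatever was opened last
        else:
            stack.append(m.group(1))
    if pos < len(text):
        runs.append((text[pos:], tuple(stack)))
    return runs


if __name__ == "__main__":
    fragment = ("<b> this is in bold<i> this is in italic and bold</>"
                "this is in?</><i> this is in italic</>")
    for run in trace(fragment):
        print(run)
```

Under that assumption the first `</>` pops the italic, so "this is in?" comes out bold only. Whether that is what the author meant is exactly the question: the reader has to trust the nesting, because the closer itself says nothing.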