PHP - is there a better way to do this?


TonyRPH

Original Poster:

12,971 posts

168 months

Monday 27th March 2023
My (self written) PHP site reads pages from a database.

I want to insert <br> and/or <p></p> tags; however, using nl2br() makes the page fail W3C validation because, for example, every table tag ends up followed by a <br>.

I did initially write a function to ignore a block of text between two fake tags, e.g. <!--start-->(.+)<!--finish-->, but I couldn't get this to work with multiple start/finish tags in the same page.
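A rough sketch of one way the marker idea could cope with several blocks per page. It assumes the markers are written exactly as <!--start--> and <!--finish-->, and the function name is made up for illustration. A greedy (.+) runs from the first start marker to the last finish marker, which is a common reason a pattern like this fails with multiple blocks; a lazy (.*?) stops at the nearest finish marker instead.

```php
<?php
// Hypothetical helper: apply nl2br() only OUTSIDE <!--start-->...<!--finish--> blocks.
function nl2br_outside_markers(string $text): string
{
    // (.*?) is lazy, so each block ends at the first <!--finish--> that follows it.
    // The "s" flag lets "." match newlines. The capture group makes preg_split()
    // return the protected blocks as well as the text between them.
    $parts = preg_split(
        '/(<!--start-->.*?<!--finish-->)/s',
        $text,
        -1,
        PREG_SPLIT_DELIM_CAPTURE
    );

    $out = '';
    foreach ($parts as $i => $part) {
        // With a single capture group, odd indexes hold the protected blocks.
        $out .= ($i % 2 === 1) ? $part : nl2br($part);
    }
    return $out;
}
```

Everything between a pair of markers is passed through untouched, so tables can keep their line breaks in the database.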

So I came up with this solution (pastebin link).

However it feels wrong doing it like this, so is there a better way to do it?

maffski

1,868 posts

159 months

Monday 27th March 2023
The kludge might be to do the initial replacement and then replace the resulting broken tags with corrected ones.

So if your first replacement is creating </td><br></tr> or something do a replacement to convert it back.
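Roughly what that kludge could look like. The tag list and the function name are assumptions for illustration, not anything from the original site.

```php
<?php
// Hypothetical helper illustrating the "replace, then repair" idea.
function nl2br_then_repair(string $text): string
{
    // Second argument false => nl2br() emits <br> rather than <br />.
    $html = nl2br($text, false);

    // Assumed list of tags that must not be followed by a <br>.
    $tags = 'table|thead|tbody|tr|th|td';

    // Remove a <br> that directly follows an opening or closing tag from the list.
    return preg_replace('~(</?(?:' . $tags . ')\b[^>]*>)<br>~i', '$1', $html);
}
```

Line breaks in ordinary text still become <br>; only the ones straight after the listed tags are undone.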

The better solution is to think 'why am I allowing some of the text to be HTML and some to be plain' and stop it happening.


king arthur

6,562 posts

261 months

Monday 27th March 2023
Stop putting line breaks between your table tags then!

TonyRPH

Original Poster:

12,971 posts

168 months

Monday 27th March 2023
The table is just plain text in the database.

It's stored like the snippet below. So apart from writing it all on one long line, I can't think how to store it without line breaks.

I could of course remove all the line breaks when reading it out of the db, but that would affect all the plain text as well.

Alternatively, I could store it without line breaks, but that would make any incidental edits (made directly in the db) fun. And I would still need to restore the line breaks for text paragraphs.

text said:
<table class="center">
<tr>
<th>OPAMP</th><th>THD</th><th>Notes</th>
</tr>
<tr>
<td>LM4562NA</td><td>0.00027%</td><td></td>
</tr>
</table>

mw88

1,457 posts

111 months

Monday 27th March 2023
Not the answer you're looking for but if the site works and looks correct in all major browsers I'd ignore the W3C Validator personally! Can't remember the last time I checked a site using it.



king arthur

6,562 posts

261 months

Monday 27th March 2023
TonyRPH said:
The table is just plain text in the database.

It's stored like the snippet below. So apart from writing it all on one long line, I can't think how to store it without line breaks.

I could of course remove all the line breaks when reading it out of the db, but that would affect all the plain text as well.

Alternatively, I could store it without line breaks, but that would make any incidental edits (made directly in the db) fun. And I would still need to restore the line breaks for text paragraphs.

text said:
<table class="center">
<tr>
<th>OPAMP</th><th>THD</th><th>Notes</th>
</tr>
<tr>
<td>LM4562NA</td><td>0.00027%</td><td></td>
</tr>
</table>
I think you need to have a little rethink about what it is you're doing because either the data you have in your database is supposed to be HTML or it's not.

If it's HTML then why aren't the desired <br> tags already in it? And if it's plain text then why does it have HTML table tags in it?

I would separate them out and have a table template into which you put the text you want. Then you can format the text how you want before fitting it into the template.

If that doesn't make sense then some sort of regex to pull the text out from between the <td> tags might be a solution, you need one that returns everything between the two tags, and another that replaces the same with a token that can then be replaced with the formatted text. That's if the text you are trying to format is only ever found inside table cells. If not then....I dunno.
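As a sketch of the regex idea, preg_replace_callback() can do the "pull out, format, put back" steps in one pass instead of two regexes and a token. It assumes the text to format only ever sits inside <td> cells and that tables are not nested; the function name is hypothetical.

```php
<?php
// Hypothetical helper: format only the text found between <td ...> and </td>.
function format_cells(string $html): string
{
    return preg_replace_callback(
        // Lazy (.*?) stops at the nearest </td>; "s" lets "." span newlines.
        '~(<td\b[^>]*>)(.*?)(</td>)~is',
        function (array $m): string {
            // $m[1] = opening tag, $m[2] = cell text, $m[3] = closing tag
            return $m[1] . nl2br(trim($m[2]), false) . $m[3];
        },
        $html
    );
}
```

The line breaks between the table tags themselves are left alone, so no stray <br> ends up between </td> and </tr>.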

TonyRPH

Original Poster:

12,971 posts

168 months

Monday 27th March 2023
mw88 said:
Not the answer you're looking for but if the site works and looks correct in all major browsers I'd ignore the W3C Validator personally! Can't remember the last time I checked a site using it.
Yes the site functions perfectly in all the browsers I tested. I'm just a bit of a perfectionist.

king arthur said:
<stuff>
Ah I see what you mean now.

The pages stored in the db are mixed plain text but where there are links, images or tables etc. then of course it's html.

I guess I could create a separate database table for the tables I have (one of my other sites works like this, but that is solely tabular information).

Rather than apply <br> tags to everything during editing I'll just stick with my kludge. Performance doesn't seem to suffer so I'm happy with that.




budgie smuggler

5,379 posts

159 months

Monday 27th March 2023
TonyRPH said:
Ah I see what you mean now.

The pages stored in the db are mixed plain text but where there are links, images or tables etc. then of course it's html.

I guess I could create a separate database table for the tables I have (one of my other sites works like this, but that is solely tabular information).

Rather than apply <br> tags to everything during editing I'll just stick with my kludge. Performance doesn't seem to suffer so I'm happy with that.
Then check it for tags and only nl2br when there are none?

TonyRPH

Original Poster:

12,971 posts

168 months

Monday 27th March 2023
budgie smuggler said:
Then check it for tags and only nl2br when there are none?
That's what my solution (pastebin link) is doing however I thought there might be a better way.

king arthur

6,562 posts

261 months

Monday 27th March 2023
If you know that your plain text part never has any angle brackets at the end of a line in it then you could do something like

$string = str_replace(">\r\n", ">", $string);

That would remove the line breaks after all opening and closing tags where you don't want a <br> to be inserted.

Then do the $string = nl2br($string). Should work.
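Put together, and with the line endings normalised first so it doesn't matter whether the data was saved with \r\n or \n, that might look something like this (the function name is made up):

```php
<?php
// Hypothetical wrapper around the two steps described above.
function nl2br_skip_after_tags(string $string): string
{
    // Normalise Windows (\r\n) and old Mac (\r) line endings to \n first,
    // so a single search string covers however the data was saved.
    $string = str_replace(array("\r\n", "\r"), "\n", $string);

    // Drop the line break after anything that ends a line with ">".
    $string = str_replace(">\n", ">", $string);

    // Remaining line breaks are in plain text, so they become <br>.
    return nl2br($string, false);
}
```

This still relies on the plain text never ending a line with an angle bracket.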

Edited by king arthur on Monday 27th March 14:41

TonyRPH

Original Poster:

12,971 posts

168 months

Tuesday 28th March 2023
king arthur said:
If you know that your plain text part never has any angle brackets at the end of a line in it then you could do something like

$string = str_replace(">\r\n", ">", $string);

That would remove the line breaks after all opening and closing tags where you don't want a <br> to be inserted.

Then do the $string = nl2br($string). Should work.

Edited by king arthur on Monday 27th March 14:41
Thanks I tried this but as expected it caught too many tags, so I ended up with this (which is just a variant on my original idea).

This still catches "<table class="blah">" but I can live with that for now, and I'll format the tables some other way, or create specific database tables and use a for loop to read them out - or perhaps just use pure CSS tables (one of my pages already has this).

code said:
function nl2pY($string) {
    // Lines containing one of these strings are output without an added <br>.
    $ends_with = array("<table>", "</div>\n", "</th>\n", "<tr>", "</tr>\n", "</td>\n");
    $stuff = '';
    $lines = explode("\n", $string);
    foreach ($lines as $line) {
        // With \r\n data, explode() leaves a trailing \r; turn it into \n.
        $Xline = str_replace(array("\r\n", "\r", "\n"), "\n", $line);
        $match = (str_replace($ends_with, '', $Xline) != $Xline);
        if ($match) {
            $stuff .= $Xline;
        } else {
            $stuff .= "{$Xline}<br>\n";
        }
    }
    return $stuff;
}

budgie smuggler

5,379 posts

159 months

Tuesday 28th March 2023
TonyRPH said:
budgie smuggler said:
Then check it for tags and only nl2br when there are none?
That's what my solution (pastebin link) is doing however I thought there might be a better way.
Beg your pardon, I remember reading that code; I must have disregarded it before replying.

I don't think there's anything wrong with that approach, it's simple and it works: if you have HTML, leave it as is, otherwise turn it into HTML.

However, an easier way to check if your text is HTML is to run strip_tags() over it, then compare it to the original. That way you don't have to look for all those specific tags. I guess it probably would be slightly faster as well. Just be aware that it might also treat text that merely looks like a tag as a tag, e.g. if somebody wrote <lol>.

E.g.
code said:
function nl2pX($string) {
    if ($string !== strip_tags($string)) {
        // it's html, just return it
        return $string;
    } else {
        // it's text
        return nl2br($string);
    }
}
eta: f knows how you're supposed to use the code tags, it keeps trying to turn everything into wiki links
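For pages that mix plain text and HTML, the same strip_tags() comparison could in principle be applied one line at a time rather than to the whole page. This is only a sketch with a made-up function name, and it has an obvious limitation: a text line that contains an inline tag such as a link counts as HTML and gets no <br>.

```php
<?php
// Hypothetical per-line variant of the strip_tags() check.
function nl2br_plain_lines(string $text): string
{
    // Split on any of the three common line-ending styles.
    $lines = preg_split('/\r\n|\r|\n/', $text);

    $out = array();
    foreach ($lines as $line) {
        // A line that strip_tags() leaves unchanged contains no tags.
        $isPlain = (strip_tags($line) === $line);
        $out[] = $isPlain ? $line . '<br>' : $line;
    }
    return implode("\n", $out);
}
```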


Edited by budgie smuggler on Tuesday 28th March 10:43

king arthur

6,562 posts

261 months

Tuesday 28th March 2023
budgie smuggler said:
TonyRPH said:
budgie smuggler said:
Then check it for tags and only nl2br when there are none?
That's what my solution (pastebin link) is doing however I thought there might be a better way.
Beg your pardon, I remember reading that code; I must have disregarded it before replying.

I don't think there's anything wrong with that approach, it's simple and it works: if you have HTML, leave it as is, otherwise turn it into HTML.

However, an easier way to check if your text is HTML is to run strip_tags() over it, then compare it to the original. That way you don't have to look for all those specific tags. I guess it probably would be slightly faster as well. Just be aware that it might also treat text that merely looks like a tag as a tag, e.g. if somebody wrote <lol>.

E.g.
code said:
function nl2pX($string) {
    if ($string !== strip_tags($string)) {
        // it's html, just return it
        return $string;
    } else {
        // it's text
        return nl2br($string);
    }
}
eta: f knows how you're supposed to use the code tags, it keeps trying to turn everything into wiki links


Edited by budgie smuggler on Tuesday 28th March 10:43
I suspect that would ultimately have the same effect as my idea, whereas I think Tony only wants to stop <br> tags being added after certain other tags, namely the ones in his array.

In which case you could do

// The tags are listed without a trailing "\n" because the line break is appended in the loop.
$ends_with = array("<table>", "</div>", "</th>", "<tr>", "</tr>", "</td>");
foreach ($ends_with as $tag) {
    $string = str_replace($tag . "\r\n", $tag, $string);
}
$string = nl2br($string);

Which I think would be quicker and clearer.

(It might be str_replace($tag . "\n", $tag, $string), it depends on how the data was created)

But really, you should not be relying on <br> tags to space out <p> and <div> elements. Use CSS to create the margins and padding you want; <br> should only be used for line breaks within a block of text such as a paragraph.
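For the plain-text parts, one possible sketch is to wrap each blank-line-separated block in <p> and keep <br> for single line breaks inside it, leaving the spacing between paragraphs to CSS margins. The function name is made up, and it assumes the input contains no HTML.

```php
<?php
// Hypothetical helper: wrap blank-line-separated blocks of plain text in <p>.
function text_to_paragraphs(string $text): string
{
    // Normalise line endings so the split pattern only has to deal with \n.
    $text = str_replace(array("\r\n", "\r"), "\n", trim($text));

    // One or more blank lines separate paragraphs.
    $blocks = preg_split('/\n{2,}/', $text, -1, PREG_SPLIT_NO_EMPTY);

    $out = '';
    foreach ($blocks as $block) {
        // Single line breaks inside a paragraph still become <br>.
        $out .= '<p>' . nl2br($block, false) . "</p>\n";
    }
    return $out;
}
```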

TonyRPH

Original Poster:

12,971 posts

168 months

Tuesday 28th March 2023
king arthur said:
<snip>

But really, you should not be relying on <br> tags to space out <p> and <div> elements. Use CSS to create the margins and padding you want; <br> should only be used for line breaks within a block of text such as a paragraph.
The site (in my profile) is not that sophisticated, and largely consists of a few pages of text, some with tables and images and some links.

It's just a hobby site, I'm not a web dev (that much is evident!) - however I am proud to say that it is written from the ground up and doesn't depend on any CMS.

All the CSS is done by hand, no bootstrap etc. and no PHP frameworks.




TonyRPH

Original Poster:

12,971 posts

168 months

Tuesday 28th March 2023
Well, that was a good day's work.

All tables are now stored in the database (as SQL tables and not HTML) and are dynamically created when read out, so no more issues with tags.
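For anyone reading later, a minimal sketch of what generating a table from rows might look like. This is not the Pastebin code; the function name, parameters and the "center" class default are assumptions, and the rows would come from whatever database query the site uses.

```php
<?php
// Hypothetical renderer: $headers and $rows would come from a database query.
function render_table(array $headers, array $rows, string $class = 'center'): string
{
    // Escape every value so cell contents cannot break the markup.
    $esc = function ($v): string {
        return htmlspecialchars((string) $v, ENT_QUOTES, 'UTF-8');
    };

    $html = '<table class="' . $esc($class) . "\">\n<tr>";
    foreach ($headers as $h) {
        $html .= '<th>' . $esc($h) . '</th>';
    }
    $html .= "</tr>\n";

    foreach ($rows as $row) {
        $html .= '<tr>';
        foreach ($row as $cell) {
            $html .= '<td>' . $esc($cell) . '</td>';
        }
        $html .= "</tr>\n";
    }
    return $html . "</table>\n";
}
```

Calling render_table(array('OPAMP', 'THD', 'Notes'), array(array('LM4562NA', '0.00027%', ''))) would rebuild the example table from earlier in the thread, and because the markup is generated, nl2br() never sees it.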

Link to Pastebin code (dynamic table generation)