Stack Overflow: PHP tutorial that is security-, accuracy- and maintainability-conscious?
[+279] [10] bobince
[2010-01-22 17:19:15]
[ php security ]
[ http://stackoverflow.com/questions/2119083/php-tutorial-that-is-security-accuracy-and-maintainability-conscious ] [DELETED]

Colleagues often ask me: “bobince”, they say*, “I want to learn PHP, but I know you're always ranting on about poor code which is full of errors and security holes. That's why I normally don't like talking to you really. But, I'm looking to learn PHP now and I'd like to be able to write good code. Where's a tutorial that will teach me how to do it properly, so my site won't get all hacked up and you won't get all cross at me?”

“Hmm...” I reply, then point and call out “Look over there! A lovely pony!”, and run away.

I do keep looking for a PHP tutorial that isn't awful, full of disastrously terrible practices, but I've yet to find anything I can recommend.

Dear reader, can you perchance help me out? Unlike previous questions, I'm not after a security-specific tutorial for people who can already code in PHP; I'm looking for a good tutorial (sites or books) for those new to the language, that happens to be solid on security and writing readable code. I want the reader to be able to learn PHP properly from the start, not to have to say “well you can learn it here... but then afterwards you'll need to go to this other tutorial to find out about the bad habits and fundamental misunderstandings you've just picked up”.

(*: they don't say that. It is not my real name.)


I want a tutorial that:

  1. Uses HTML-escaping consistently from the start. The very first “Hello, your_inputted_name!” example should be using htmlspecialchars() correctly. This should not be introduced as an afterthought in a separate security chapter. There should be no HTML-injection hole anywhere in the given example code.

  2. Either uses SQL-escaping consistently from the start, or parameterised queries. If SQL-escaping is used it should be correct escaping such as mysql_real_escape_string() if the database is MySQL. I do not want to see addslashes(). There should be no SQL-injection hole anywhere in the given example code.

  3. More generally, the tutorial should understand the problems to do with putting a string inside another string, and treat escaping as a matter of correctness, not merely of security. That is to say there is no reason a user called N'gwale O'Reilly should be prevented from having an apostrophe in their name, or from talking about the HTML <script> tag in their message like what I'm doing now; they merely need the right form of encoding when they're output.

    The tutorial should explain that when a string goes into a new context it needs an encoding process appropriate for that context, like htmlspecialchars() in HTML output; it should not regard less-than symbols as ‘evil’ and attempt to ‘sanitise’ them away. I don't want to see strip_tags. I absolutely don't want to see misguided ‘security’ measures like looping over the $_GET array removing punctuation characters, or blanket-applying haphazard output-stage escaping to an input stage.

    There are so many bad examples and so much bad code like this out there, even in learning materials that are supposed to be explicitly about security. As questions on SO have proved, it is difficult to fix people's misunderstanding of how and when string escaping needs to happen once they've learned a quick hack ‘solution’ from some misconceived ‘PHP Security’ site or book.

  4. I don't want to see eval(). I don't want to see system(). Nothing good comes of having these in a tutorial!

  5. There should be proper separation of active logic and page markup. I don't mean they have to be kept religiously in different files or using a specialised templating language, but I do at least want the actions up at the top and the page down at the bottom with only display logic inside it. I don't want to see an echo or print hidden inside the guts of some program logic.

  6. Actually I don't really want to see echo/print used at all except as the only thing in an output block. PHP is a templating language, there is no reason to go about creating complex strings of HTML then echoing them. This only encourages the use of unescaped "<p>$var</p>"-style interpolation, and makes it difficult to follow. All markup should be literal markup in the PHP file, not hidden within a string.

  7. Proper use of indentation is essential, both in the HTML tag hierarchy and in the PHP code. Ideally there should be a single hierarchy, using structures like <?php if () { ?> ... <?php } ?> (or the alternative syntax) as if they were well-formed XML. The HTML should itself be in ‘well-formed’ style even if it is not actually XHTML.

  8. Some mention of XSRF [1] would be a nice bonus.
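
As a rough illustration of points 1 and 3, here is a minimal sketch of one string being encoded differently for each context it is written into. The helper names html_text() and url_part() and the /user.php URL are made up for this example, and a UTF-8 page is assumed.

```php
<?php
// Hypothetical helpers: each returns its input encoded for exactly one context.
function html_text(string $s): string
{
    // For HTML text content and quoted attribute values (assumes a UTF-8 page).
    return htmlspecialchars($s, ENT_QUOTES, 'UTF-8');
}

function url_part(string $s): string
{
    // For a single path segment or query-string value inside a URL.
    return rawurlencode($s);
}

// Apostrophes and angle brackets are ordinary data, not something to strip out.
$name = $_GET['name'] ?? "N'gwale O'Reilly <script>";
?>
<p>Hello, <?= html_text($name) ?>!</p>
<p><a href="/user.php?name=<?= html_text(url_part($name)) ?>">Profile</a></p>
```

The link value goes through both encoders because it sits in two contexts at once: it is a URL component, and that URL is in turn written into an HTML attribute.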


In short, I want a tutorial that teaches one to code something like [with predefined output-escaping function shortcuts]:

<?php
    // Escaping shortcuts: m() returns a quoted SQL string literal; h() and u() echo directly.
    function m($str) { return "'".mysql_real_escape_string($str)."'"; }
    function h($str) { echo htmlspecialchars($str); }
    function u($str) { echo rawurlencode($str); }

    $result= mysql_query('SELECT * FROM items WHERE category='.m($_POST['category']));
?>
<table>
    <?php while($row= mysql_fetch_assoc($result)) { ?>
        <tr id="row-<?php h($row['id']); ?>">
            <td>
                <?php h($row['title']); ?>
            </td>
            <td class="thing">
                <a href="/view.php?id=<?php u($row['id']); ?>">View</a>
            </td>
        </tr>
    <?php } ?>
</table>

and not like:

<?php $category=$_POST["category"];
$result = mysql_query("SELECT * FROM items WHERE category=$category");
 echo "<table>";
while($row=mysql_fetch_assoc($result))
{$html="<TR id=row{$row[id]}><td class=\"thing\">".$row[title];
$html.="</td><td><a href=\"/view.php?id={$row[id]}\">View</a></td></TR>";
print $html; }
print "</table>"; ?>

which is the sort of Other People's Code I'm fed up of fixing. Is there anything out there I can recommend, in the hope that when I end up looking after the code it doesn't look like this kind of mess?

I know I'm being too prescriptive and everything, but is there anything even close? So far I've yet to find a tutorial without SQL-injection in it, which is just so depressing.
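
For the parameterised-query alternative that point 2 allows, something along these lines might serve as a sketch. It uses PDO with an in-memory SQLite database purely so that it runs on its own (this assumes the pdo_sqlite extension is available; the mysql_* functions in the example above were removed in PHP 7). The items rows are invented for the illustration.

```php
<?php
// Assumption: pdo_sqlite is installed; a real site would connect to its own database.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Invented sample data, including an apostrophe that naive quoting would break on.
$db->exec('CREATE TABLE items (id INTEGER PRIMARY KEY, title TEXT, category TEXT)');
$insert = $db->prepare('INSERT INTO items (title, category) VALUES (?, ?)');
$insert->execute(["O'Reilly's <b>book</b>", 'books']);
$insert->execute(['Teapot', 'kitchen']);

// Each ? is a placeholder: the value is bound as data, never spliced into the SQL text.
$category = $_POST['category'] ?? 'books';
$query = $db->prepare('SELECT id, title FROM items WHERE category = ?');
$query->execute([$category]);
$rows = $query->fetchAll(PDO::FETCH_ASSOC);
?>
<table>
    <?php foreach ($rows as $row) { ?>
        <tr id="row-<?= htmlspecialchars((string) $row['id'], ENT_QUOTES, 'UTF-8') ?>">
            <td>
                <?= htmlspecialchars($row['title'], ENT_QUOTES, 'UTF-8') ?>
            </td>
        </tr>
    <?php } ?>
</table>
```

Because the category value never becomes part of the query text, there is no quoting step to forget; the HTML-escaping at output time is still needed, since that is a separate context.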

(95) Golly, it looks like you've got a good start to your own tutorial right there! Flesh it out a bit and publish it. - sequoia mcdowell
(10) The point 7 makes the code sooo ugly. Why not teach the concepts of views and controllers from the ground up? - Alix Axel
(5) Why don't you write one yourself? - jamolkhon
(2) @presario: I think he's afraid to start writing a tutorial and ending up with a book. No offense bobince! :) - Alix Axel
(1) Yes, that's pretty much what I'm trying to avoid! I'm not primarily a PHP coder and I don't claim to understand the language in-depth, so I'm not in an ideal place to start writing learning materials even if I had that skill and the time to do it. Surely someone else must have done? - bobince
(1) @bobince: I've been thinking and wouldn't a PHP tutorial with all the concerns you mentioned be somewhat overwhelming to a beginner? One option might be to adapt an otherwise good tutorial with the stuff missing. Maybe the ones from Wikiversity (en.wikiversity.org/wiki/PHP) or Wikibooks (en.wikibooks.org/wiki/Programming:PHP). - Alix Axel
(1) I dunno, I don't think it need be overwhelming to explain that text-in-HTML needs escaping... as long as you introduce that from the start and don't go heavy on the "double quoted string $var interpolation" stuff it should, I think, seem relatively natural. The SQL escaping of course can be substituted with parameterised queries which are conceptually no more difficult to deal with than string slinging. Admittedly the PHP mysqli interface is annoyingly verbose, but that's easily solved with a quick wrapper function (like shortening echo(htmlspecialchars()) to h() above). - bobince
(1) XSRF is a more complicated case to understand, of course, and you need actual additional measures to defeat it instead of just treating strings correctly in the first place like you should have done all along (as in SQL or HTML escaping). That's why I only put it as a bonus point! Given the slowness in tackling SQL and HTML injections, I have little optimism that XSRF attacks will be fixed in my lifetime. - bobince
(1) As for views and controllers... well, I do think that's a bit optimistic, and as I'm not that familiar with PHP I can't really recommend a good choice of MVC library to go with it. Plus what the guys are trying to learn is ‘PHP’, not really ‘PHP-plus-one-particular-framework’, whichever that is. - bobince
(1) Well, I like tutorials from PHPRO, there is one really good tutorial for beginners there. I gave links in my answer. - Richard Knop
(3) Point 4 should include the "error control operator" php.net/manual/en/language.operators.errorcontrol.php - Gerry
(1) There's still an SQL-injection possible in your example: if I post "category=" + encodeURIComponent("unclassified OR 1=1") to your example, the query reads SELECT * FROM items WHERE category=unclassified OR 1=1, retrieving unclassified and all other items from the database. You should enclose those escaped parameters within quotes (so the query will read SELECT * FROM items WHERE category='unclassified OR 1=1' and the complete string will be treated as a single parameter). - Marcel Korpel
(1) @Marcel: err... [scratches head] unless I'm missing something, that's your example code isn't it? :-) m() also shouldn't be using echo. Adjusted appropriately - bobince
Ah, that way. :-D Indeed, for clarity's sake I added those functions and forgot that one could insert the quotes within m(). - Marcel Korpel
This is a nice article, totally out of the box. While most people would have made PHP the base and echoed the HTML, you did the exact opposite. I think many people are deterred from doing this because having <?php ... ?> all over the place makes the code less readable. The second reason is that breaking up code such as the while loop makes it hard to track and maintain. Proper indentation helps to alleviate things, but having both HTML and PHP with two sets of indentation alternating with each other makes tracking somewhat difficult when there are lines that contain both PHP and HTML. - Question Overflow
(1) @BenHuh I obviously agree that the bad example is worse than the good example, but, even the good one looks a little frightening. However, what I would like is tutorials explaining when and why I should use only php in a file, as opposed to php and html or javascript. Also when to modify the file (when requested by client ie via link click, or build the script serverside then send it over...) - puk
I realize I might get booed for mentioning w3schools, but they have a forum topic asking for suggestions on how to make their PHP tutorial better w3schools.invisionzone.com/index.php?showtopic=4236 - puk
(2) Please write your own article bobince. I would gladly both read and share it with others. :) - Andreas Johansson
which would be nice doing <?php while($row= mysql_fetch_assoc($result)) { ?> <?php } ?> or <?php while($row= mysql_fetch_assoc($result)): ?> <?php endwhile;?> - ianace
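
On the XSRF point from the question and the comments above: one common additional measure is a secret per-session token that every state-changing form must send back. The following is only a sketch of that idea; csrf_token(), csrf_check(), the token field name and /delete.php are hypothetical names, and the plain $session array stands in for $_SESSION after session_start().

```php
<?php
// Hypothetical helper: create the session's token once, then keep returning it.
function csrf_token(array &$store): string
{
    if (!isset($store['csrf_token'])) {
        $store['csrf_token'] = bin2hex(random_bytes(32)); // 64 hex characters
    }
    return $store['csrf_token'];
}

// Hypothetical helper: true only if the submitted value matches the stored token.
function csrf_check(array $store, $submitted): bool
{
    return isset($store['csrf_token'])
        && is_string($submitted)
        && hash_equals($store['csrf_token'], $submitted); // constant-time comparison
}

$session = [];   // assumption: a real page would pass $_SESSION here
$token = csrf_token($session);
?>
<form method="post" action="/delete.php">
    <input type="hidden" name="token" value="<?= htmlspecialchars($token, ENT_QUOTES, 'UTF-8') ?>">
    <input type="submit" value="Delete">
</form>
```

The receiving script would call csrf_check($_SESSION, $_POST['token'] ?? null) before acting and refuse the request when it returns false; a page on another site cannot read the token, so it cannot forge a matching form.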
[+142] [2010-01-22 20:12:58] Yogesh

Chris Shiflett is the author of the book Essential PHP Security, which is a really good book on security. Check the following blog links on various topics.

  1. Cross site request forgeries. (XSRF) [1]
  2. Cross Site scripting [2]
  3. SQL Injection [3]
  4. Other article of that author with one session [4]
  5. Good article on SQL injection [5]
  6. Discusses quotes, htmlentities() and htmlspecialchars() [6]
[1] http://shiflett.org/articles/cross-site-request-forgeries
[2] http://shiflett.org/articles/cross-site-scripting
[3] http://shiflett.org/articles/sql-injection
[4] http://shiflett.org/articles
[5] http://simon.net.nz/articles/protecting-mysql-sql-injection-attacks-using-php/
[6] http://www.nyphp.org/PHundamentals/5_Storing-Data-Submitted-Form-Displaying-Database

(4) Someone up-vote him please, I'm out of votes for today. - Alix Axel
Thank you Daniel and Alix. I am new to this site and learning how this whole process works. Thanks for putting all links in one place. Shall I go ahead and delete my other answers links for brevity? Thanks and Regards. - Yogesh
@Yogesh: No problem. Yes, you should. =) Otherwise a mod will. - Alix Axel
@Alix: OK. That's helpful. I will be more careful next time around. Thanks again. - Yogesh
@Alix: Done, upvoted. But now I need somebody who'll vote for me! :) - Pekka 웃
@Pekka: My request had a reason (see stackoverflow.com/revisions/…) xD you can stop voting now, it's enough. :P - Alix Axel
(16) Yeah, this seems to be decent stuff; however it is still fixing security problems up after the fact, for people who've already learned PHP but learned bad habits. I'm especially looking for a PHP tutorial that doesn't make those mistakes in the first place. @Pekka: just give me a PHP tutorial which doesn't include HTML or SQL injections and I'll happily vote for you! :-) - bobince
@bobince: I meant it differently, I was helping Alix out because he had run out of votes, but I would like to upvote this answer as well. :) - Pekka 웃
[+30] [2010-12-21 10:36:50] Halil Özgür

A complete tutorial for beginners?

I saw such a thing in The Matrix once; they were able to load a complete chopper piloting program into Trinity's brain in seconds. I think what you want is essentially the same as this. It clashes with the "learn a language in 10 years" idea, without which being a proficient coder is hardly possible.

I had such an idea once too. But when I started to think about what I should put in it to make it decent (to me), my brain started to bubble, and I was not even able to plan the first chapter sanely.

Then I realized that in order to satisfy my decency rules, I simply had to put everything I had in there, with the same organization as in my brain. Even after that, it would be only as good as what "good" meant to me at that time. Such a book would be as thick as an encyclopedia and would never be finished, since the tech advances faster than any human being can write.

After that, I saw the value in studentship, the master-apprentice system, and devoting time to learning and doing, and especially in how virtually unchanged this has been through millennia.

While there are many improvements that can be made to books and tutorials, not everything can be put in them, because, as Joel Spolsky pointed out, understanding many basic ingredients of coding and computing requires essential knowledge of many layers and lower levels of the system, like CPUs, memory, packets, HTTP, etc.

Of course, you can go straight to writing the most secure and nice version of the code without explaining what the question mark in $db->query('SELECT etc FROM tbl WHERE id = ?'); ... means. But who would read (and even use) it then?

Hence comes the idea of short, well-defined and very specific articles and blog posts, written by people you feel are far superior to you. The "tutorial" you dream about is made of years of learning the basics and even more years of such tiny bits. But I don't know whether it is writable.


I don't think 3 or 4 items (mysql_real_escape_string, htmlspecialchars, indenting) are outside the scope of a beginner tutorial. Most of the other points asked for in the question speak more to what the tutorial should not contain than to what it should contain. - Billy ONeal
(2) In real life you seldom use those functions. Instead, you do SQL escaping with prepared statements (e.g. the ? syntax), and HTML escaping with, say, a (maybe automatic) view helper (e.g. $_e($var);). No one can withstand the burden of cluttering the code with these functions, so they either abandon them or switch to the expected syntax. And any tutorial that is not usable, at least partly, in real life is basically useless. What I was trying to convey is that you just can't squeeze years of mentality, learning and experience into a single tutorial. - Halil Özgür
(1) +1, reminds me of the dilbert cartoon where the admin assistant says to Alice, "Teach me how to be an engineer, I don't care if it takes all day." - Nathan Hughes
A note from the future - After years, I've come across a better tutorial: ruby.railstutorial.org/ruby-on-rails-tutorial-book. It's not about PHP and it doesn't focus on security but I appreciate how it endorses good practices from the start. OK, it's not a tutorial, it's a book! (Just what I was trying to say, remember?) - Halil Özgür
[+14] [2010-01-22 17:36:53] Alix Axel

The Writing [1] Secure [2] PHP [3] Series [4] by Added Bytes seems like a good place to start on security.

EDIT: I just came across the OWASP Ruby on Rails Security Guide [5]. I know it's RoR, but there's plenty of decent stuff in there that also applies to PHP. There is also the OWASP PHP Top 5 [6] article.

[1] http://www.addedbytes.com/writing-secure-php/writing-secure-php-1/
[2] http://www.addedbytes.com/writing-secure-php/writing-secure-php-2/
[3] http://www.addedbytes.com/writing-secure-php/writing-secure-php-3/
[4] http://www.addedbytes.com/writing-secure-php/writing-secure-php-4/
[5] http://www.lulu.com/items/volume_64/5811000/5811294/3/print/5811294.pdf
[6] https://www.owasp.org/index.php/PHP_Top_5

(1) Yeah, this ain't bad as far as it goes (it's obviously not complete yet), but it's not really somewhere a beginner can start. It's correcting common errors; I want a tutorial for not-yet-PHP-coders that avoids the errors from the start. - bobince
(3) @bobince: I think it'll be hard finding only one tutorial that covers all of those topics well. Seriously, maybe you should write one. - Alix Axel
[+7] [2010-08-11 20:52:01] samandmoore

I was going to suggest Learning PHP, MySQL, and JavaScript [1] because it is the book I used to learn the language.

It is a very complete text that delves deep into all 3 subjects in a very readable fashion, but I think that the author's methods of avoiding SQL-injection and the like are somewhat primitive (they leave a lot to be desired, but provide a sturdy foundation).

Thus, I have to agree with everyone else here and suggest that you spend a few days/weeks/months putting together a tutorial. I acknowledge your proclamation that you are not an expert on the subject, but the very details of your question imply that you should take on the chore yourself. Perhaps you could develop a wiki-like tutorial that you could write the bulk of and then encourage people to amend and contribute to it.

[1] http://oreilly.com/catalog/9780596157142

[+6] [2010-01-22 21:30:50] Grant Palin

There is also php|architect's Guide to PHP Security [1] - it provides broad coverage of PHP-related security topics.

[1] http://rads.stackoverflow.com/amzn/click/0973862106

[+6] [2010-09-29 15:04:55] Metropolis

PHP 5 Objects, Patterns, and Practice [1] is the best book I have read on PHP so far. He really does a great job of going over all aspects of good PHP practice.

I cannot tell you how good the security is in this book, but I bet it is good enough for a beginner to get a good foundation without writing crap code.

I honestly hate seeing PHP and HTML blocks mixed in with each other. But I also hate seeing echo/prints everywhere for the HTML. Instead I created a PHP HTML class which I use to validate all of my HTML and ensure that the programmer does not write bad HTML code. This also gives me a way to update my HTML as new DOCTYPES come out, and I can escape everything from that class. By using this HTML class I also have all PHP on a page without the PHP/HTML blocks mixed in together.

[1] http://rads.stackoverflow.com/amzn/click/1590593804

(1) Your practice sounds very much like the late Douglas Clifton's Markup Toolkit - Marcel Korpel
Kind of the same concept, but I am using classes for my markup, and I never use global variables (EVER). Plus mine is much more robust. - Metropolis
[+4] [2010-01-22 23:04:16] foxed

About a year ago I went through Kevin Skoglund's "PHP with MySQL Essential Training", which was OK, but given that it was aimed at complete and utter beginners, who would probably struggle with the concepts discussed here, I expect it's not what you need. I'm currently working through (slowly ...) his second series, "PHP with MySQL Beyond the Basics". So far I don't think I've seen any mention of HTML escaping, but the database class he worked through definitely had escaping in there from day dot.

Again, gotta bear in mind that I am a beginner - I just thought you might want to review it a bit to see what's been taught so far.

I've been teaching myself some Git whilst working through this so you can see the source of the project (currently at Chapter 7) here: http://github.com/Mithadriel/LyndaPHP---ImageGallery

Otherwise - good luck in finding what sounds to me like a Holy Grail :)


Should note that these are paid-for video tutorials available from Lynda.com - foxed
Are there any free materials (eg. accompanying code samples) anywhere I can have a look at? Video tutorials might be suitable though I don't so much fancy watching through them myself :-) - bobince
There is an accompanying zip which contains the Exercise Files - effectively the code he's teaching through at each step of each Chapter. Happy to send it over for you to have a browse through. - foxed
Yes, if you can chuck a link to some examples to the address in my profile that'd be great. Thanks! - bobince
[+4] [2010-07-07 14:35:50] Richard Knop

What about PHPRO tutorials? I've always liked that website and I learned a lot there. There's also an introduction for PHP/MySQL tutorial:

There are also couple good articles about security, for instance:


(2) It's a case of lots of little fragmented tutorials here though, that don't build on each other, which is not easy for a beginner to navigate through. The MySQL tutorial is full of the kind of HTML-injection errors mentioned in the Security tutorial (though that tutorial's approach of using filter_var is inadvisable); they'd be vulnerabilities if any user-input data was present. The Validation tutorial makes the exact same mistake warned about at the start of the Security tutorial! - bobince
(1) Yes but they mention SQL injections and what to do against them at the end of article! - Richard Knop
[+1] [2010-01-22 20:23:13] Elzo Valugi

I think you need books, not tutorials. PHP is a complex programming language and no tutorial, no matter how complex, will give you all you want.

The best tutorial for PHP is the PHP documentation [1] itself. It is complete, with a lot of examples and interesting user contributions.

[1] http://es2.php.net/docs.php

Sure, books are good. Sadly I've reviewed several PHP books including some that are aimed specifically at “security” and they've always been full of obvious holes and hopeless approaches to security, even the ones from the normally-impeccable publishers. So if you can recommend a particular good book that's very welcome! As for the PHP docs, I agree they're mostly good, but the amount of tutorial content in them is very small, not something I can easily throw at new users. - bobince
[-4] [2010-01-22 20:40:38] streetparade

Well, I think Zend Developer Zone is a good place to start.

Here is a tutorial:

PHP For the Absolute Beginner http://devzone.zend.com/node/view/id/627


(21) Unfortunately, this tutorial is a complete disaster, exhibiting pretty much every negative point I listed. SQL injection, HTML injection in absolutely every case, mixing echoed output into the middle of actions, inconsistent indentation. It has the footnote-style security chapter I was complaining about, except even then the security chapter addresses no real security issues and has full-on SQL-injection exploits. Woeful. - bobince