Very simple php/mysql question

Portal Home > Knowledgebase > Articles Database > Very simple php/mysql question

Posted by lexington, 09-23-2007, 10:53 PM
Hello, I have never coded something like this before so I am not sure how to go about it but I see it used in scripts all of the time. I know how to update a database table with the SET command, but if I wanted a table field to store multiple data separated by a comma, how do I insert the entries but leave the last entry without a comma? So the DB field would have a field named "banned_ids" for example, and instead this field would contain the ids "1,4,8" so if anyone with a matching ID in this field would not have access. So I wonder how to make it so that the last comma at the last entry isn't there since I know that having a comma at the end causes the script not to work correctly. Thanks.
Posted by QSDConsulting, 09-23-2007, 11:15 PM
If I understand your post (It was rather unclear) UPDATE `table` SET `banned_ids` = '1,4,8' WHERE `id` = '1' OR `id` = '2' You can keep using OR as much as you want.
Posted by lexington, 09-23-2007, 11:31 PM
Well this query will be automated, so I would need for it to detect the ID from a $_GET or $_POST command and update the DB with the new ID. However I was told in the past that if there is a comma at the end of the last ID, the script will not work correctly when trying to select each ID with preg_match for comparison to the ID that is banned. So lets say there are five people on a page and two of them are troublemakers and need to be banned. The admin can click on the "ban" button which would add the first troublemaker's ID into the banned_ids column (in this example lets say his ID is 20). Then the next ID of another troublemaker the admin bans which is 7 would also be added to the banned_ids column. Since there cannot be a comma at the end of the db entry, how can this query insert the IDs such as 20,7 into banned_ids without making it such as 20,7, with a comma at the end? That is what I am asking and I hope this makes sense. Every script that I look at that inserts multiple data into one DB field never has a comma at the end of it.
Posted by QSDConsulting, 09-23-2007, 11:42 PM
I will need to see the code. Why are you storing all the IDs in one row? They should each have their own row.
Posted by lexington, 09-23-2007, 11:48 PM
Hmm, so you never have seen multiple IDs into one table before? I see it used in nearly every large DB interactive php script on the internet. I think it would be too much to have to create a new DB table for banned Ids and add the code to select each column when I could just add a new field into an existing DB field that contains the IDs. It isn't like this field would be huge anyway, no more than four Ids per table.
Posted by ZackN, 09-24-2007, 12:56 AM
If you know there's already IDs in the field, and there isn't a comma at the end, just do a SET banned_ids = CONCAT(banned_ids, ',ID') If you don't know if there's an ending comma, grab the field and check. I excluded the SELECT and UPDATE queries, but this should help with avoiding the ending commas.
Posted by lexington, 09-24-2007, 01:47 AM
Ah yes that should work I will test it later. Thanks
Posted by QSDConsulting, 09-24-2007, 08:14 AM
It would take longer to extract all the IDs from a database, explode them and see if any exist in the array then to use a while statement. I dont know what large applications you are referring to.
Posted by foobic, 09-24-2007, 09:40 AM
I agree with VI - use of comma-delimited lists in data is always an indication of bad database design. You should probably store the banned ips in their own table and include that in your query. If you must do this, handle the list always as an array. Create the string you store in the database with implode (',',$array). Convert the stored string to the array with explode(',',$string).
Posted by lexington, 09-24-2007, 10:23 AM
It is only 3 entries max per table so I doubt the server would struggle exploding that.
Posted by QSDConsulting, 09-24-2007, 01:52 PM
Its still bad design.
Posted by ZackN, 09-24-2007, 04:25 PM
I'm sure lots of large applications use this. I can think of at least one off hand, vBulletin. It can be used for specific purposes, such as individual buddy lists. Instead of inserting a new row for each new buddy each user has, only one row is used for each user and their entire buddy list is in one field.
Posted by lexington, 09-24-2007, 04:50 PM
Yeah and I also hate to break it to you Idiot but I have a site with over 50,000 users and each of those users use such a field with text mind you, not numerals and they have hundreds of comma delimited entries and they can be called up on a page in less than a second with a server load of 0.14. So I do not see a problem there.
Posted by foobic, 09-24-2007, 06:53 PM
Oh well, that proves it then! This sort of bad design is pretty common - I think it usually happens when people discover late in the process that there's more data to store than they first thought and they're unable to (or just don't want to) change the database schema. If you have the chance to avoid it by designing your database right in the first place you'll build a better application as a result. If you don't care, that's your choice.
Posted by QSDConsulting, 09-24-2007, 08:16 PM
That doesn't make it the best method, it is less efficient then doing it my way. Last edited by QSDConsulting; 09-24-2007 at 08:19 PM.
Posted by ZackN, 09-24-2007, 11:22 PM
I've used storing array data in single fields on a site with heavy DB interaction and 2000+ concurrent users. If I did not do this, a table containing the same data would have taken ~10x the amount of rows, more space for repeat data, and more queries to get the same data spread out across a HUGE table. Now can someone please tell me how this is "bad design"?
Posted by Steve_Arm, 09-24-2007, 11:47 PM
Don't do it. Such design steps are non scalable. I've done it in my early days and regret it, when I had to update the system it was very difficult. Plus using indexes it's very fast for the db to select the data. Regarding storage requirements I've done some calculations: 50.000 members with 3 IDs for this record SMALLINT which is up to 65535 records and 2 bytes in size: 50.000 * 2 * 3 = 300.000 bytes size in DB. Using the many rows solution (1 byte per char): - Its 3 bytes for the IDs and 2 bytes for 2 commas: 50.000 * 5 = 250.000 bytes And that only for IDs up to 9, then you add 1 byte per digit. Numbers will beat it in size. Consider half of the records have an average of this: 345,346,347. That is: 25.000 * 11 = 275.000 bytes + half of the above = 400.000 bytes Last edited by Steve_Arm; 09-24-2007 at 11:55 PM.
Posted by foobic, 09-24-2007, 11:55 PM
Can't say without knowing what your schema is and what data you're storing in these lists, but typically a list of data in a field can be removed to a separate table, so the main table gets smaller, not larger, and if there's any question of repeat data you've got a major design problem. Simple example with list: or with 2 tables:
Posted by mdburton, 09-24-2007, 11:55 PM
ive used a design like this once also it bit me in the *** after awhile and i had to REcode things and REdo the database schema while it might make sense for some things, it definitly has some major limits especially indexing limits what if id 3 wants to see who all has him as a friend, are you going to program it to filter ALL of the fields looking for a 3 of every other member? or you can simply have a seperate table and it indexed by that field and get it done ALOT quicker and less CPU usage im not a professional like some of these might be, but that is my personal experience and reason of why it should not be used....
Posted by ZackN, 09-25-2007, 12:13 AM
You don't see the repeat data in there? Now let's say there's 10k users with friends, and each user has ~15 friends. On top of that, the average userid is right around 6 digits long. There will be a lot of repeat userids going on. This is what I am doing.
Posted by foobic, 09-25-2007, 02:11 AM
In case you missed mdburton's point, how will you answer the question: "Who has chosen Mary as a friend?" Good luck.
Posted by juangake, 09-25-2007, 09:22 AM
Folks, foobic is right. The comma-delimited data list should be considered always as a *fast hack* for very specific purposes. If you rely on this for everything, you're missing the basics of "relational database" (the concept that will open your eyes to a rainbow of nice, fast and cool new queries / questions to your database)... "The foobic way" seems repetitive, but it's just professional database design that allows [efficient] queries without crushing your database later.
Posted by ZackN, 09-25-2007, 05:13 PM
It's a two way friends system. If you're on their list, they're on your list. No need to see "Who has chosen". I wouldn't call it a fast hack, although I am using it for a very specific purpose. If I have the time this weekend I'll do some basic speed/load tests on the two different methods using real data.
Posted by ThatScriptGuy, 09-25-2007, 05:22 PM
Do yourself a favor and enroll in any relational database class at your local community college. You WILL learn about good and bad practices. Comma delimited lists in a field = bad practice. Just trust those of us that have been doing this for awhile.
Posted by juangake, 09-26-2007, 08:54 AM
Going a step further in "Comma delimited lists in a field" war... If you have a table: id 1, name 'Fred', friendlist '5,3,7' id 2, name 'Peter', friendlist '1,7,2' ...when you want to delete friend '3' from the friendlist of user id '1', you have to: ---> pull the friendlist for that user (SELECT) ---> manually remove the '3' from the friendlist '5,3,7' ---> re-compose the friendlist, taking care to not have a trailing comma at the end of the string (the purpose of the thread) ---> UPDATE the friendlist of user id '1' (we have 2 queries now) Wow, that's a lot of effort for just removing an item from a list. I believed relational databases were created for something... *grin* Why not just a DELETE statement? Well, for that you need a table as foobic proposed... And... if you want to *insert* a new friend on the list? Oh yeah, you're going to pull the friendlist again (SELECT), check for yourself in your PHP code if that user id is already in the list, if not then add it to the friendlist, but NOT adding a trailing comma because is the last one... then UPDATE the row... With a good design, you could just do a nice INSERT (...) using [ON DUPLICATE KEY] and proper primary keys, letting MySQL do the work, and doing it fast when it comes to more complex or big operations. My two cents. As a side note: I was doing "Comma delimited lists in a field" myself. Until you learn how relational databases work and have to do bigger projects that must be scalable and/or efficient. Regards, Juan
Posted by ZackN, 09-26-2007, 04:55 PM
It's true that this will create two queries, however this is only used when removing friends. PHP's explode and implode work great for comma delimetered strings. Don't need to mess around with trailing commas, and it's easy to remove or add values to arrays. While not being very clean, something like should work. I also ran a couple tests on these two methods using 27k users and 400k friends. Time calculating is using microtime(true); in php5. The first test was selecting all userids of users who had a particular person on their list. For the comma fields, I used this query: For table: Trial 1: Comma-field Time: 0.0756509304047 Table Time: 0.118787050247 Trial 2: Comma-field Time: 0.0746519565582 Table Time: 0.119049072266 The second test was selecting the friends list. Comma-field query: Table code: Time taken includes getting friend_ids into an array. Used for comma-field: Used for table: Trial 1: Comma-field Time: 0.00429201126099 Table Time: 0.11908197403 Trial 2: Comma-field Time: 0.00414204597473 Table Time: 0.119457960129 Thoughts?
Posted by juangake, 09-26-2007, 05:55 PM
It's nice you take the time to perform this test. Well I just can't believe this without running my own tests on a properly indexed database I can play on a production database of 200k users, so I'll strike back!
Posted by ZackN, 09-26-2007, 06:15 PM
Ah, I completely forgot about indexes on this test. Changes the first test results to favor the table. Comma-field: 0.051148891449 Table: 0.0215289592743 The comma field still beats out the table in the second test. Comma-field: 0.0042359828949 Table: 0.0188090801239
Posted by juangake, 09-27-2007, 09:40 AM
Yes! Anyway very logic, just wins the method which needs fewer row scans. Ok, so my conclusion is that I still prefer the "table method" in order to be able to use LEFT JOIN and extract additional data in the same query for each friend. With "comma field" you would need to perform additional queries. I mean, once you got the 'friendlist' ids, how would you get their 'names', for example? You need a new query. A LEFT JOIN with the table method would do the trick. You can do magic using subqueries, but they are subqueries, so greater overload. Juan