PISTONHEADS SEARCH ENGINE
Discussion
I will jump on this thread in a bit and explain the whole project, why we did it, the logic and technology used behind it etc...
But I will add a few things for now:
- We release things incrementally, so what we have released now is not the end of development on this feature. We can tweak it and add better functionality (such as better pre-search criteria to stop you having to navigate the pyramid upside down).
- Google's business is search - it would be hard to compete with a multi-billion pound organisation whose primary focus is on building a great search engine. We do know how to get near to it for Forum Search, but we need to build much more infrastructure to capture how users are using the search and which results they are clicking on, so we can feed that back into relevance. The problem is justifying the cost of doing that detailed work; we all have to satisfy a bottom line at the end of the day.
Ok LordGrover, tried again. Put in 'TVR S brake servo', as written.
Search revealed lots of other marques. Went to TVR (which I shouldn't have had to do as it was in the search title) and opened. Went to S and hovered and got the option 'only', clicked on this and it gave me TVR S plus all the other marques. It did NOT restrict selection to S only. I didn't go further.
I am using Windows 7 on a good modern laptop and browsing using Mozilla (up to date).
OK Pete, I think it is over to you. You have had some pretty hard feedback on the system so far. Only fair to give you the opportunity to digest what has come back and consider the implications.
I have got to say it doesn't look like 'tweaking' to me. There seem to be some pretty fundamental issues to look at.
Maybe you should share the objectives of the project with us to see how relevant they are first?
My use is very regular but it is almost entirely related to gaining the latest technical information on repair and restoration, adding to that information from research I have done, and directing less experienced members to the best PH source of the advice they have sought. All that is almost entirely related to one model of one marque.
greymrj said:
Ok LordGrover, tried again. Put in 'TVR S brake servo', as written.
Search revealed lots of other marques. Went to TVR (which I shouldn't have had to do as it was in the search title) and opened. Went to S and hovered and got the option 'only', clicked on this and it gave me TVR S plus all the other marques. It did NOT restrict selection to S only. I didn't go further.
I am using Windows 7 on a good modern laptop and browsing using Mozilla (up to date).
1. There is no need to include TVR S in the search box - it may even skew the results.
2. If you selected Only next to S Series then something's wrong. On my computer it clears ALL other checkboxes. Perhaps that's where your issue lies?
greymrj said:
OK Pete, I think it is over to you. You have had some pretty hard feedback on the system so far. Only fair to give you the opportunity to digest what has come back and consider the implications.
I have got to say it doesn't look like 'tweaking' to me. There seem to be some pretty fundamental issues to look at.
Maybe you should share the objectives of the project with us to see how relevant they are first?
My use is very regular but it is almost entirely related to gaining the latest technical information on repair and restoration, adding to that information from research I have done, and directing less experienced members to the best PH source of the advice they have sought. All that is almost entirely related to one model of one marque.
Might be tomorrow morning the way today is going... but will do, and thank you for your use case - it is all helpful.
LordGrover said:
1. There is no need to include TVR S in the search box - it may even skew the results.
I am going to leave this to the webmaster for a bit. There is now plenty of evidence that the search function doesn't meet the users' needs as it stands. However, before I do so, can I ask you to look again at the above statement you made? With respect, I suggest you think hard about that statement and what it means in the context of the objectives of a search function! I will leave that with you.
Why we changed Forum Search?
We have over 30 million posts, and around 10,000 new posts every day, across 200 plus forums.
Google may be easy to set up, and it brings its own algorithms for searching - but it has limitations:
- It doesn't index or search any hidden forums
- You cannot limit to posts by a user
- You cannot reliably search between dates
- You cannot search specific forums
- We are at the mercy of Google crawling our site for updates
- We cannot customise it - we have to take Google's design for it
- We cannot tell traffic from it apart from normal traffic coming from Google in our tracking
- Search all the posts across the site
- Or posts by a user
- Or by a date
- Even in hidden forums (useful for moderators)
- Have real-time updates (within 1 minute of a post being added)
- Be responsive across multiple devices
- Cope with growth
- Super fast response times
We use a technology called Elasticsearch to run the search underneath. This is a dedicated search technology based on Lucene, and it is becoming more and more popular. Elasticsearch has some features which mean it can do relevancy better than we could programme ourselves:
- Stemmers (so being able to take a word such as "speeding" and also search for "speed", "speeds" etc.)
- Shingle filters (these let it look at the proximity of words for relevance, so a mention of BMW and M3 somewhere in the same post is not as relevant as BMW M3 next to each other)
- Stop words (removal of common English words from the search terms, e.g. and, or, in, if, it etc.)
- Synonyms (allows us to map forum colloquialisms onto the same word, e.g. porker, porsche etc.)
- Phrase suggester (provides an alternative spelling if the user has entered something wrongly when searching)
- Forum title
- Forum posts
- Date
- Forum poster name
- Forum
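As a rough illustration of the text-analysis features listed above (stop words, synonyms, stemming and shingles), here is a toy sketch in plain Python. It is not Elasticsearch code and not the site's actual configuration: the word lists, the crude stemmer and every function name are assumptions made up for the example.

```python
import re

# Hypothetical, tiny word lists purely for illustration.
STOP_WORDS = {"and", "or", "in", "if", "it", "the", "a", "for"}
SYNONYMS = {"porker": "porsche"}  # forum slang mapped onto one canonical word


def naive_stem(word):
    """Very crude suffix stripping; real stemmers are far more careful."""
    for suffix in ("ing", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def analyse(text):
    """Lowercase and split the text, drop stop words, map synonyms, then stem."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    tokens = [t for t in tokens if t not in STOP_WORDS]
    tokens = [SYNONYMS.get(t, t) for t in tokens]
    return [naive_stem(t) for t in tokens]


def shingles(tokens, size=2):
    """Groups of adjacent words, so 'bmw m3' can be matched as one unit."""
    return [" ".join(tokens[i : i + size]) for i in range(len(tokens) - size + 1)]
```

With this sketch, "Speeding in the Porker" and "speeds porsche" reduce to the same tokens, and the shingle "bmw m3" only appears when the two words sit next to each other. Note that a one-letter token such as "s" survives untouched, which hints at why a model name like "S" carries so little signal on its own.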
Not all users put the specific thing they are talking about in the title, and there are quite a few threads which have the gem of the information in the post that is not in the title, especially those that say "Looking to buy Honda S2000" etc.
Date is not weighted very highly, as we feel that the content is the key to a result, but happy to look into this to see if recency should be weighted higher.
There are going to be multiple use cases for searching the forums, and the feedback we have had has been massively positive. But we will take your cases and look at how we can improve the result set for them.
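One way to picture "weighting recency higher" is to blend the text relevance score with a decay based on the age of the post. The sketch below is only an assumption about how such a blend could look; the function name, the half-life and the 0.3 weight are invented for illustration and are not how the site actually scores results.

```python
from datetime import date


def recency_weighted_score(text_score, posted, today,
                           half_life_days=365.0, recency_weight=0.3):
    """Blend a text relevance score with an exponential recency decay.

    recency_weight=0 ignores the date entirely; recency_weight=1 lets the
    score halve for every half_life_days of age.
    """
    age_days = max((today - posted).days, 0)
    decay = 0.5 ** (age_days / half_life_days)
    return text_score * ((1.0 - recency_weight) + recency_weight * decay)
```

Raising `recency_weight` pushes newer posts up the list, while content still dominates at low values, which matches the trade-off described above.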
I hear you. Let's give you time. Immediate thoughts:
Personally I almost always find date to be of very high importance so I would certainly want to see it have a higher weighting. Certainly I would have expected the last column to be in reverse order. i.e. latest as the default position, with the ability to search earlier if required.
One big potential advantage over Google is the ability to locate the last post on a subject, whereas Google finds the date by the start of the last thread on the subject.
Is it possible to give some difference in weighting to threads which have the search subject in the title, over those which have it in the content? i.e. to prioritise threads which are ABOUT the subject well over those which merely mention the subject. That differentiation did not seem to happen and most of the 'most relevant' items the search found were actually of low relevance to that search subject.
I was very worried by LordGrover's comment that by being more specific in the search subject it could skew the results; I have to say that amazed me. If I put in TVR S as part of the subject then I expect the search to be limited to posts relevant to TVR S!
I would certainly expect the first selection in the current format to 'tick' only what I asked for, if that was in my search subject. To have to uncheck the bits I do not want seems a poor approach. If you want to search across the whole of PH for someone mentioning 'servo' then by all means expand your search, but I would have thought the rest of us wanted the tip of the pyramid, or at least to start from there.
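For what it is worth, the requests above (favour title matches, restrict to one forum, newest first) map onto standard Elasticsearch query features: a `multi_match` query with a per-field boost, a `term` filter and a `sort` clause. The sketch below only builds the request body as a Python dict. The field names `title`, `body`, `forum` and `posted_at` are hypothetical, as is the boost of 3; the site's real index layout is not known.

```python
def build_forum_query(text, forum=None, newest_first=False, title_boost=3):
    """Build an Elasticsearch request body that favours title matches.

    Field names ("title", "body", "forum", "posted_at") are hypothetical.
    """
    query = {
        "bool": {
            "must": {
                "multi_match": {
                    "query": text,
                    # "title^3" boosts matches in the title by a factor of 3.
                    "fields": [f"title^{title_boost}", "body"],
                }
            }
        }
    }
    if forum is not None:
        # A filter narrows the results without changing relevance scores.
        # Assumes "forum" is indexed as an exact-match (keyword) field.
        query["bool"]["filter"] = [{"term": {"forum": forum}}]
    body = {"query": query}
    if newest_first:
        # Sort by date descending, falling back to relevance for ties.
        body["sort"] = [{"posted_at": {"order": "desc"}}, "_score"]
    return body
```

For example, `build_forum_query("brake servo", forum="TVR S Series", newest_first=True)` would describe a search limited to one forum, with title matches weighted up and the latest posts first.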
In the specific TVR S search, there is a very generic word "S" in it, which makes it hard for the system to know that it is a specific model... though we did look at telling the system that, and it is something we may revisit. If the search was for TVR Tuscan, you will probably see better results as that word is more distinctive.
The other thing Google does is to monitor the click through, dwell times after clicking, number of pages viewed etc to then drive back into the search results. This allows it to tune the results on user behaviour to get out better sentiment of searching. For example "TVR S steering rack" could have several reasons for that search. Maybe someone wants to buy one, or fix one, or just general info about one.. or something else.
By looking at the results clicked on and driving that back in, then that helps move the better sentiment higher in the relevancy of results. This is something we are looking at, but won't be till the new year.
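The click feedback idea described above can be sketched very simply: count how often each result is shown and clicked for a query, then nudge the text score by the smoothed click rate. This is a toy illustration under stated assumptions, not Google's method and not anything the site has built; the class, its methods and the weighting formula are all hypothetical.

```python
from collections import defaultdict


class ClickFeedback:
    """Toy re-ranker that feeds click-through counts back into ordering."""

    def __init__(self, weight=1.0):
        self.shown = defaultdict(int)    # (query, thread_id) -> times shown
        self.clicked = defaultdict(int)  # (query, thread_id) -> times clicked
        self.weight = weight

    def record(self, query, thread_id, clicked):
        """Log one impression of a result and whether it was clicked."""
        key = (query, thread_id)
        self.shown[key] += 1
        if clicked:
            self.clicked[key] += 1

    def click_rate(self, query, thread_id):
        """Smoothed click rate; results with no data sit at a neutral 0.5."""
        key = (query, thread_id)
        return (self.clicked.get(key, 0) + 1) / (self.shown.get(key, 0) + 2)

    def rerank(self, query, results):
        """results is a list of (thread_id, text_score) pairs."""
        def boosted(result):
            thread_id, text_score = result
            return text_score * (1.0 + self.weight * self.click_rate(query, thread_id))
        return sorted(results, key=boosted, reverse=True)
```

A result that users keep clicking for a given query gradually rises above one with a slightly better text score that nobody opens. Collecting those counts is the extra infrastructure mentioned earlier in the thread.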
I take your point about 'S' being potentially generic, although it does still appear as a separate model in your first field. We S guys wouldn't like to be missed!
At the end of the day the proof will be in the pudding. As it stands it is of little value to me for the purposes for which I search on PH. I appreciate that this doesn't mean it is of little value to others. You may have noted another thread on the TVR S forum on which I asked other members to test the search facility. Several prominent members did, and the consensus rather supports my view.
Include S as a model!
I am not sure what LordGrover was on about but allow the searcher to be more rather than less specific in their search subject.
Make it so you select anything you want other than the subject model, rather than deselect.
Think hard about the weight given to 'date' and see what proportion of searches are answered best by latest information. I still remain to be convinced about defining 'relevance' as it is a qualitative rather than quantitative matter.
Give threads ABOUT a subject more 'relevance' than posts which merely mention the subject wording.
Initially filter to give priority to most up to date posts, with option to reverse this.
How long do I leave it before testing again?
227bhp said:
I must be missing something here as I'm having difficulty fathoming out how 'Does not work' can be discussed for three pages!
I put in a single word or phrase
Search engine says "We found 0 results for your search "xxxx"
Even following the earlier hyperlinks gives the same results.
What phrase did you try?
greymrj said:
I take your point about 'S' being potentially generic, although it does still appear as a separate model in your first field. We S guys wouldn't like to be missed!
At the end of the day the proof will be in the pudding. As it stands it is of little value to me for the purposes for which I search on PH. I appreciate that this doesn't mean it is of little value to others. You may have noted another thread on the TVR S forum on which I asked other members to test the search facility. Several prominent members did, and the consensus rather supports my view.
Include S as a model!
I am not sure what LordGrover was on about but allow the searcher to be more rather than less specific in their search subject.
Make it so you select anything you want other than the subject model, rather than deselect.
Think hard about the weight given to 'date' and see what proportion of searches are answered best by latest information. I still remain to be convinced about defining 'relevance' as it is a qualitative rather than quantitative matter.
Give threads ABOUT a subject more 'relevance' than posts which merely mention the subject wording.
Initially filter to give priority to most up to date posts, with option to reverse this.
How long do I leave it before testing again?
It sounds like you'd prefer that it only searches on post subject by default.
That might be appropriate for your usage, but quite probably not for many other users. For example, a lot of the posts in GG are 'What Car' type posts which, more often than not, don't include any vehicle details in the title.
Similarly, I've found that searching across all forums by default returns more useful results for me and with fewer clicks.
I'd also prefer it if it defaulted to descending order when date is selected.
Perhaps it needs user configurable defaults for 'search subject only' and 'default sort' ?
rscott said:
227bhp said:
I must be missing something here as I'm having difficulty fathoming out how 'Does not work' can be discussed for three pages!
I put in a single word or phrase
Search engine says "We found 0 results for your search "xxxx"
Even following the earlier hyperlinks gives the same results.
What phrase did you try?
"We found 0 results for your search "What phrase did you try""
227bhp said:
rscott said:
227bhp said:
I must be missing something here as I'm having difficulty fathoming out how 'Does not work' can be discussed for three pages!
I put in a single word or phrase
Search engine says "We found 0 results for your search "xxxx"
Even following the earlier hyperlinks gives the same results.
What phrase did you try?
"We found 0 results for your search "What phrase did you try""