Groupprops:Performance in search engines


One of the aims of the Groupprops wiki is to help people find information as quickly as possible, and a first step towards this aim is to enable people to find the wiki itself as early as possible. In other words, in those areas where we have good and relevant content, we would like people to be able to find that content through search engines.

Methods of measuring performance

Using benchmark search terms

  • Standard search terms: These are typically taken from Category:Basic definitions in group theory
  • Particular search terms: These may be phrases or clauses like "normal versus characteristic" or "characteristic implies normal" or "what is a group?". Typically we measure the performance of those phrases and clauses that closely correspond to names of articles on this wiki.

Choice of search engines

Performance in search is measured using the following search engines:

Search options

We typically measure for:

  • Search within the wiki: This tells us whether that page, or relevant pages, have been indexed
  • Search across the World Wide Web: This tells us how that page, or relevant pages, rank against other pages containing the term
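
The measurement described above can be sketched in a few lines of Python. This is only an illustrative sketch, not a tool the wiki actually uses: the host name in WIKI_HOST, the function names, and the sample result lists are all assumptions made for the example, and the result URLs are expected to be collected by hand (or by any other means) from whichever search engine is being measured.

```python
from urllib.parse import urlparse

# Assumed host name for the wiki; adjust if the wiki is served elsewhere.
WIKI_HOST = "groupprops.subwiki.org"


def first_wiki_rank(result_urls, wiki_host=WIKI_HOST):
    """Return the 1-based rank of the first result hosted on the wiki.

    Returns None when no result in the list points to the wiki.
    """
    for rank, url in enumerate(result_urls, start=1):
        host = urlparse(url).netloc.lower()
        if host == wiki_host or host.endswith("." + wiki_host):
            return rank
    return None


def benchmark(results_by_term, wiki_host=WIKI_HOST):
    """Map each benchmark search term to the wiki's best rank for that term."""
    return {
        term: first_wiki_rank(urls, wiki_host)
        for term, urls in results_by_term.items()
    }


# Hypothetical, hand-collected result lists for two benchmark terms.
sample_results = {
    "characteristic implies normal": [
        "https://en.wikipedia.org/wiki/Characteristic_subgroup",
        "https://groupprops.subwiki.org/wiki/Characteristic_implies_normal",
    ],
    "group": [
        "https://groups.google.com/",
    ],
}

ranks = benchmark(sample_results)
```

A rank of None for a within-wiki search would suggest the page has not been indexed, while a low rank in a web-wide search shows how the page compares with other pages for the same term.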

General trends in search

Non-mathematical noise results

Since much mathematical terminology is borrowed from ordinary English, many of the top search results concern the everyday sense of the word, or come from other fields that have borrowed the same word. For instance, searching for "group" gives Google Groups as the top entry.

Mathematical noise results: Wikipedia and its clones

Because of the tremendous and growing prevalence of Wikipedia in free online encyclopaedic content, the Wikipedia entries on a topic, if they exist, usually dominate the results. More problematic, though, is the dominance of the many Wikipedia clones, which copy content en masse from the encyclopaedia. These often push the next best entry down by a page, so many searchers never see it.

Usually, the extent to which Wikipedia articles get indexed and cloned is largely independent of their quality; it reflects the overall weight and inertia of Wikipedia.

Noise results: reference articles

It is not easy to search only for definition articles, and often many of the entries on the first page of results are not definition articles but research papers in which the term is used repeatedly. For instance, most of the results for "self-normalizing subgroup" are articles where the term is used or casually referenced, rather than defined.

Link to the wiki but to the wrong page

Interestingly, even when a search result does link to the wiki, it often points not to the actual article page but to another page that links to that article. This may be because the other page is more highly linked to, or simply because it happens to have been crawled by the search engine bots while the article itself has not.

Improving performance in search engines

Currently, search engine optimization is not one of the main concerns of the wiki. In any case, it is unlikely that the structure and organization of the wiki will be deeply influenced by search engine optimization concerns, given that the wiki is already well interlinked and that search engine policies keep changing. We expect that the more we clean up the wiki, the better it will perform, at least at the level of internal search.

Doing well on searches across the web, however, may require more than just cleaning ourselves up: we need to ensure that enough outside pages point to us. The cleanest way of doing this is to make sure that enough people visit us and enjoy the site so much that they link to it from their own pages.