Google Patent Granted on Web Link Spam

When a search engine indexes pages and other documents on the Web, in hopes of providing meaningful and relevant results to searchers, it doesn’t rely only upon the content found on those pages. It also considers the quality and quantity of links pointing to them.

Examples of link farms and clique attacks

A search engine like Google might determine that a page is relevant to a specific query based upon the content found on that page, and the anchor text found in links pointing to the page.

It may also consider the “relationships” between pages by looking at how pages are linked to each other. PageRank is one method of viewing those links that Google states that it uses, assigning a measure of importance to pages that are linked to from other pages. This measure, or rank, might be simplified as the probability that someone arrives at a certain page if they click arbitrarily and randomly on links on the pages they’ve surfed.
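
To make the random-surfer idea concrete, here is a minimal sketch of the textbook PageRank iteration on a tiny, made-up link graph. The function name `pagerank`, the damping value of 0.85, and the four example pages are illustrative assumptions; this is not a description of how Google actually computes its scores.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Estimate random-surfer probabilities for a small link graph.

    links maps each page to the list of pages it links to.
    The damping value and iteration count are illustrative assumptions.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page gets a baseline share from random jumps.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its current rank among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outlinks spreads its rank evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: "c" is linked to by three other pages.
graph = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}
ranks = pagerank(graph)
```

In this toy graph, page "c" ends up with the largest share because three pages link to it, while "d", which nothing links to, keeps only the baseline from random jumps.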

This combination of relevance in content and anchor text, as well as importance based upon link relationships, helps to determine the order in which pages show up in response to queries from searchers. While it’s possible that Google might proceed to rerank a certain number of those top search results based upon other signals, this method of determining the top results can influence whether or not a page might be seen by searchers.

However, there’s a problem with link-based ranking methods such as PageRank. It’s possible to intentionally manipulate the link structure between pages to artificially increase the rankings of certain pages.

A patent granted to Google today describes a way that the search engine might identify two different methods of spamming pages, and take action against “artificially inflated importance” (or PageRanks) for pages.

This method of identifying link spam involves looking at a sample of links to a page, to see whether the search engine can identify certain characteristics that show up in a couple of different types of manipulative linking.

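Link Farms and Clique Attacks

A search engine might look at a number of links to a page, to see whether those links share certain characteristics that might differ from those of links to authentic pages (ones that aren’t engaged in manipulative linking).

The description in the Google patent specifically picks out two types of link spam, link farms and clique attacks, and explains how links involved in those behaviors might be different from links to authentic pages.

Link Farm – A link farm is typically a large group of pages created primarily to point at a single page, with the aim of giving the false impression that the page pointed to is important.

An example might be the home page of an ecommerce site whose ranking is artificially boosted by creating many “dummy” web pages that all link to that home page. If a search engine considers the links from the link farm, those links might cause the site to rank higher in search results.

In a link farm, the pages that point to the central page all tend to have very low importance scores (or PageRanks). A truly important page is more likely to have links from some highly ranked pages, in addition to links from low ranked pages.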
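
The contrast described above can be illustrated with a rough heuristic: look at the importance scores of the pages linking to a target, and flag the target when nearly all of them are very low. This is only a sketch of that idea, not the math in the patent; the function name `looks_like_link_farm_target` and every threshold in it are hypothetical.

```python
def looks_like_link_farm_target(inlink_scores, low_score=0.001,
                                min_inlinks=20, low_share=0.95):
    """Return True when nearly all inlinking pages have low importance scores.

    inlink_scores holds one importance score per page linking to the target.
    All thresholds are made-up values chosen for illustration.
    """
    if len(inlink_scores) < min_inlinks:
        return False  # too few links to judge either way
    low = sum(1 for score in inlink_scores if score < low_score)
    return low / len(inlink_scores) >= low_share


# Hypothetical samples: 200 dummy pages versus a mix that includes stronger pages.
farm_like = [0.0001] * 200
authentic_like = [0.0001] * 150 + [0.01] * 50

print(looks_like_link_farm_target(farm_like))
print(looks_like_link_farm_target(authentic_like))
```

A real system would presumably treat a flag like this as a reason for closer inspection rather than as a verdict, since new or obscure pages can also have only low-scoring inlinks.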

Clique Attack – Another type of link spam is the clique attack, or web ring, which is a set of pages that predominantly point at each other, to present a false appearance of authority or importance.

The pages in this kind of clique attack, or web ring, don’t link much outside of the other pages in the ring, and their links to each other might cause each site to appear higher in search results if the links from the web ring are considered by the search engine. Authentically important pages, by contrast, tend to link out to pages beyond any such ring.
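
One way to picture “predominantly point at each other” is to measure what share of a group’s outgoing links stay inside the group. The sketch below does only that; the function name `internal_link_share` and the example pages are hypothetical, and the patent’s own detection method may look quite different.

```python
def internal_link_share(links, group):
    """Fraction of the group's outgoing links that point at other group members.

    links maps each page to the list of pages it links to.
    """
    group = set(group)
    internal = 0
    total = 0
    for page in group:
        for target in links.get(page, []):
            total += 1
            if target in group:
                internal += 1
    return internal / total if total else 0.0


# Hypothetical ring: three pages linking almost only to each other.
ring_links = {
    "r1": ["r2", "r3"],
    "r2": ["r1", "r3"],
    "r3": ["r1", "r2", "news"],
}
# Hypothetical ordinary pair of sites that mostly link elsewhere.
normal_links = {
    "blog": ["news", "wiki", "shop"],
    "shop": ["blog", "supplier"],
}

ring_share = internal_link_share(ring_links, ["r1", "r2", "r3"])   # 6 of 7 links
normal_share = internal_link_share(normal_links, ["blog", "shop"])  # 2 of 5 links
```

A share close to 1.0 matches the ring-like pattern described above, while pages that also link out to unrelated resources score much lower.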

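Taking Action on Artificially Inflated Importance

When pages that are likely to be link spam have been located, under this patent, Google may take action to account for the “artificially inflated importance” of those pages.

As a first step, manual review or other algorithms might be used to check whether those pages are being used as part of a link spam scheme.

If a page is determined to be link spam, or a link spam candidate, the following actions might be taken:

  1. The links from the page might not be considered at all when determining link-based importance for other pages.
  2. The impact of links from the page might be reduced in importance.
  3. A predetermined penalty might be imposed upon the importance of links from the page.
  4. The importance of the page might be reduced in some manner that doesn’t rely upon links.
  5. The importance of the page might be reduced in some manner that doesn’t rely upon links, while the importance of links from the page is reduced as well.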
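
The five measures listed above could be sketched roughly as follows. Everything here is an assumption made for illustration: the function name `adjust_for_spam`, the action labels, and the `reduction` and `penalty` values are hypothetical, and the patent does not necessarily apply the measures this way.

```python
def adjust_for_spam(page_score, link_weight, action,
                    reduction=0.5, penalty=0.1):
    """Return (page_score, link_weight) after one of the listed measures.

    page_score is the flagged page's own importance; link_weight is how much
    its outgoing links count toward other pages. Values are illustrative.
    """
    if action == "ignore_links":          # measure 1: links not counted at all
        return page_score, 0.0
    if action == "reduce_links":          # measure 2: links count for less
        return page_score, link_weight * reduction
    if action == "penalize_links":        # measure 3: fixed, predetermined penalty
        return page_score, max(link_weight - penalty, 0.0)
    if action == "lower_page":            # measure 4: page's own importance reduced
        return page_score * reduction, link_weight
    if action == "lower_page_and_links":  # measure 5: both reduced
        return page_score * reduction, link_weight * reduction
    raise ValueError(f"unknown action: {action}")


# Example: a flagged page with importance 0.4 whose links normally count fully.
print(adjust_for_spam(0.4, 1.0, "ignore_links"))
print(adjust_for_spam(0.4, 1.0, "lower_page_and_links"))
```

Keeping the page’s own score separate from the weight of its outgoing links mirrors the distinction the list draws between penalizing a page and discounting the links it casts.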

The patent does go into depth on some of the math behind the identification of link spam in link farms and clique attacks, and is worth spending time with if you want to delve deeper into how Google might use the methods described in the patent:

Method for detecting link spam in hyperlinked databases
Invented by Sepandar D. Kamvar, Taher H. Haveliwala, and Glen M. Jeh
Assigned to Google
US Patent 7,509,344
Granted March 24, 2009
Filed: August 18, 2004

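48 thoughts on “Google Patent Granted on Web Link Spam”

  1. Very interesting. Things have certainly moved on since it was filed. Another great post!

  2. Hey Bill,

    That’s an interesting one. What strikes me again is that this was filed in 2004. I think it’s fair to say that things have probably moved on quickly since then.

    Best rgds
    Richard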

  3. Hi Richard,

    This patent does seem to fill in a void in understanding how Google might be looking at links and link spam. For a long time, many people were mistakenly attributing Yahoo’s concept of Trustrank to Google, and in this patent we see a different approach completely.

    It is likely that Google has incorporated other or additional ways of finding and acting on link spam since this was originally filed, but I think it provides some insight into an area where we hadn’t seen too much directly on the topic from Google previously.

    Hi Wollongong,

    Thanks. It has been a few years, and it’s probable that Google and the other search engines have been picking up many new ideas through resources like the AIRWeb workshops.

  4. It shows that it’s definitely not worth risking your site’s rankings by joining a web ring or link farm. You don’t want to find out the hard way. It’s hard to get out of one, and you’d also need to repair the damage, which could take you a long time to recover from.

    I can’t believe it has been the best part of 5 years!

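  5. Hi Michael,

    Thanks. I agree with your criticism, especially of the use of terms that Google included in the patent. I wouldn’t say that these are terms that have been well defined and established by the SEO industry alone, though – industry and academic researchers, site owners, web users, and others have also helped give them meaning.

    As you note, web rings have been around since the very early days of the web, before Google or Yahoo, and their purpose was to provide a way to navigate from one site to another that shared some common theme or purpose. Webring.org itself dates back to 1994, and has survived being purchased and then abandoned by Yahoo. I don’t like Google’s use of the term “web ring” in this patent either, but think that their use of “clique attack” is more useful.

    Google’s definition of a link farm in the patent, as a group of many pages pointing to a single page, is a narrow one. Link farms may also involve large numbers of pages excessively linking to each other.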

  6. This patent application provides some insight into just how poorly Google understood Webspam five years ago. It also raises some questions. How is it that Google so consistently manages to botch up the well-documented definitions established by the SEO industry?

    A link farm is not a group of Web sites that all point to one central site. To be a link farm, every member site must link to every other member site.

    Webrings have absolutely nothing to do with cliques, Web spam, or search engines. In a Webring, each member site shares navigation code (which may or may not be indexable by search engines) to help visitors find other sites in the Webring. Webrings were very popular in the 1990s, and Yahoo! actually bought the largest Webring service, Webring.org.

  7. Think of the patent application language as a snapshot of Google’s thinking as of 2004, when it was filed. Although the patent was only granted this year, they formulated these ideas and arguments many years ago (in Internet terms).

    Google has rolled out two major redesigns of its search technology (Bigdaddy/Google 2.0 in 2006 and Searchology/Google 3.0 in 2007).

    This year we’re seeing them roll out new semantic features that have been hinted at in some patent applications as well. However, no patent is really going to provide us with much information about what Google may be doing now.

  8. Hi Soren,

    The patent does seem limited, doesn’t it. 🙂 Google doesn’t discuss its approaches to web spam in public too much, but they have published at least a couple of other patent applications that involve identifying web spam using different approaches.

    My post titled Google Patent on Web Spam, Doorway Pages, and Manipulative Articles involves a granted patent from Google originally filed in 2003. It provides a wider and more complex approach to identifying web spam. Another post, Phrase Based Information Retrieval and Spam Detection, provides some information on how spam pages might be identified in a phrase-based indexing system.

    Google has also been a participant in the AIRWeb workshops with other search industry and academic members. I think it’s safe to say that Google does know more about link spam than what is reflected in this patent.

  9. Thanks, Michael.

    It can take a long time for a patent to go from just-filed application to granted patent. One of Google’s vice presidents, Udi Manber, mentioned in an interview last April that Google updated their search algorithm over 450 times in 2007 alone. Thankfully not every change is captured in a patent filing. 🙂

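  10. I know a lot of sites use directory content management systems to create hundreds or even thousands of links pointing to their sites. That doesn’t seem to work as well now as it did in the past for things like ‘Google bombing’, etc.

    Judging from this Google patent, it will be even less effective in the future.

  11. It is hard to distinguish which sites are link farms and which are not, because Google can easily accuse sites of being link farms rather than web rings.

  12. Hi Albert,

    It can be hard to distinguish between pages that are linked together because the links provide value to visitors, and pages that are linked together just to increase each other’s ranks. But, I’ve seen many pages where site owners explicitly state that the purpose of linking to each other is to help each other rank higher in search. In those cases, it’s pretty easy to tell that the links aren’t there to help visitors find useful and relevant resources…

  13. A very informative article, props! I have a question: I own several pages on the same IP, and I don’t try to hide this “network” from Google. They are all linked like a web ring, but is it really damaging if I just advise visitors of my other projects? I’m quite scared Google won’t realise that I don’t want to trick them. 😉

    Greetings from Florian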

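  14. Hi Florian,

    Thanks. Regarding your multiple pages, it’s probably a matter of scale, and of how it might look to the search engines. If you’re talking about a handful of sites, that’s not as bad as hundreds or thousands. If it appears that you’re not hiding anything, and your links are there because of common ownership of legitimate and reasonable-looking sites, that’s much better than linking together hundreds of fake blogs or splogs.

    Search engines don’t like sites that link to each other primarily to boost each other’s rankings while offering little value to searchers. If your pages also link out to other sites, and look like useful resources for visitors, you have less to worry about than sites that scrape and aggregate content from other pages and are all or mostly low ranked, low quality pages.

  15. I’ve been a website designer for over 13 years and this is the first I’m hearing about a clique attack. I wonder if it’s possible that the algorithm might pick up legitimate links from sites with multiple links as a link farm.

  16. Hi Joe,

    It’s possible that you’ve tripped over clique attacks many times without recognizing them, or knowing that someone at Google was calling them by that name.

    It is possible that some aspects of legitimate linking, such as webrings, may seem similar to clique attacks, but usually when someone gets involved in a web ring, they are also getting links from other places, and linking out to other sites in different ways, too.

  17. If people are foolish enough to get involved in link farms today, they deserve to get hit by the search engines. There are enough articles online about the topic that anyone caught doing it should have known better. It’s interesting the lengths some people will go to in order to increase their online rankings.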

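  18. Hi Bill,

    There are some people who get involved in link farms knowing the potential risks, and do it anyway. Some people end up so caught up in other aspects of bringing their business online that they don’t pay enough attention to the kinds of practices that can cause problems with search engines. Anyone putting a site online, and wanting it to show up well in search engines, should spend some time with the search engines’ guidelines, which warn against link farms.

    I’ve seen sites penalized for participating in link exchange programs, sites that learned their lesson the hard way. For those sites, it’s sometimes possible to be included again after cleaning up. Many don’t know how to build links for their sites, and do pursue some link building possibilities that they shouldn’t.

  19. Interesting article and interesting comments, especially about the gap between the 2004 filing and the 2009 grant – you’d think it would all be out of date by now. It would be interesting to know how much of it is built into ‘caffine’.

  20. Hi Mark,

    Google’s caffeine is an update of how Google stores and accesses information in their databases – the basic infrastructure of their data storage and collection. The impact of that will likely be that they can store more information, and access it quicker. I don’t think that will directly impact how they identify and act upon link spam, but their more powerful infrastructure might have an indirect impact by allowing them to devote more resources to web spam.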

  21. It makes you wonder whether link building in SEO will become a thing of the past. With spam becoming more and more of a problem, how long will it take Google to drop links from its algorithm…

  22. Hi Neil,

    Links had value in the days before search engines relied upon them for ranking, and they likely will after search engines place less reliance on links. There’s still value in providing links for visitors to your site.

    I do think the search engines are finding more ways to address link spam, advanced beyond what is described in the Google patent that is the topic of this post. Google’s exclusive license to use PageRank does expire next year, but the PageRank that they use today is likely very much different than the PageRank of the 90s. We also know that Google and the other major search engines look at a very large number of other signals in ranking pages, and will likely continue to do so.

  23. Very informative, Bill. Thanks for your answer to Florian’s question. I was wondering the same thing!!!

    And I agree with Bill Gassett. I don’t know why people would bother with link farms. It’s common sense.

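  24. Hi Lori,

    You’re welcome.

    Some people will try to use things like link farms or splogs, paid links, or other web spam as a cheap way of creating backlinks, even though the risks associated with them can be high, especially when the goal is short-term gain. I do believe that the search engines are getting better at identifying web spam, but even with high risks, there will probably always be people who try to test the search engines and gain some benefit from it.

  25. I have a local competitor that is involved with a large web ring, and he dominates the search engines. When I first started to build my online presence, I was copying his methods. I now know what he is doing is spam, but he still ranks high.

    I don’t get it. He is obviously manipulating the search engines, yet he continues to rank first for every major keyword in my real estate niche. How can he be such obvious spam and not get slapped?

  26. @Neil

    I would be very, very surprised if the search engines gave up link counts as a measure for ranking web pages. Other than contextual links, what other form of information could a “computer” use as a basis for making contextual decisions? Sure, you have things like “bounce rate”, but those metrics are used to assess the “trustworthiness” of links. Personally, I think the search engines are “stuck” with using links as the fundamental means of mining human opinion for the foreseeable future.

    Mark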

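  27. Hi Lisa,

    Keep doing what you’re doing, and avoid spam as much as you can. While your competitor may be succeeding at the moment, he could easily lose his rankings at any time if the search engines identify what he is doing as spam.

    Search engines try to address spam methods programmatically rather than on a case-by-case basis. So, while someone may get away with something for a while, chances are good that it will catch up with them. It’s also possible that some of the links you see to his site aren’t being counted by the search engines at this point.

  28. Hi Mark,

    There are other areas that the search engines are exploring to determine the importance of a web page, from user-behavior signals other than bounce rate (such as time spent on a page), to annotations in bookmarks and tags, search wikis, and social networks. Links will likely continue to play a role in how search engines rank pages, but that role may become smaller in the future.

  29. I did some research on the top ranking “make money online” bloggers and affiliates, and there’s a lot of SERP “manipulation” going on. What do you think of this niche and the tactics being used?

    Best regards,
    Trond $ Moneyonline.net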

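  30. It’s the biggest irony, Bill. These gurus supposedly teaching how to make money online are making money online by telling others how they make money online.

    Here’s pretty much how it’s done:

    Guru: Pay me and I’ll tell you how I make money.
    Noob: OK, I’ve paid you. Now tell me your secret.
    Guru: Well, that’s it right there. I just made money online.
    Noob: That’s it?
    Guru: Yep. Now go do the same thing.

    After the Noob sells a few ebooks, he’s a Guru.

    Rinse and repeat.

    Beautiful system, eh?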

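  31. Brandon: I’ve been in this industry since ’96. I’ve also written books on SEO, social media marketing, and digital strategy. In addition, I hold more than 20 internationally recognized technical certifications (Cisco CCNP, CCIE written exam, Microsoft MCSE, Master CIW, etc.).

    I don’t see any good reason why I shouldn’t offer my knowledge and services to my readers. I’m an ordinary guy who loves his work and likes helping others. If my experience and knowledge can help others reach their goals without the blood and tears I went through, how is that considered more “monkey business” than New York SEO? I founded my own SEO company in 2006 and sold it in 2009 for a fairly decent price, but I did it because the money didn’t motivate me. I simply missed the nerdy stuff and wanted more freedom. Owning my own company wasn’t what I had expected. I like SEO more than CEO. I’m much better as an SEO than as a CEO.

    I’m sure both you and Bill are making money in the “cloud” too, right?
    Want a banana? I’ll bring some peanuts 🙂

    Best regards,
    Trond

  32. Hi Trond,

    There’s a difference between what you do, by offering books and services with real value aimed at helping others, and what some people offer that is the equivalent of “how to make money stuffing envelopes” scams.

    Unfortunately, there are a lot of scams that offer people the opportunity to make money online, and don’t. It’s not hard to find them.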

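  33. Hi Bill,

    The link to the patent is not live at present, so maybe you can email an updated version, as I’d be interested in reading further.

    A few things came to mind while reading this post.

    1. It’s still a grey area as to how much link juice is allocated for each link from a site. Google have never been very specific on whether any page has a certain amount that is diluted every time a link is given out, or if each link is effectively run in parallel and doesn’t dilute at all. Matt Cutts always seems to dodge this question, too.

    2. Provided people give good, valuable content, I don’t see any problem with giving something back and linking to their site, even with anchor text. Any manual review would still see the page as relevant and valuable content, so the page linked to is also more likely to be associated with good content.

    3. I’d also like to touch on the point Brandon raised. What he describes is a pyramid/Ponzi scheme. The difference from how most MLM companies work is that the former has absolutely no value, while the latter’s products/services have a lot of value (mostly), so users get something out of them. If they want to recommend it to others, great. Fortunately, most MLM companies today offer quality products and services, but there are still some scams out there, which is a real shame.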

  34. Hi Justin,

    The link to the patent is working properly. Maybe there was a problem with the patent office website this morning.

    Chances are that there was always a different amount of PageRank passed along by different links on a page, since the launch of Google. Matt Cutts has said a number of times over the past few years that different links on a page pass along different amounts of PageRank.

    I’m really not interested in discussing the merits of MLM sites, linking to them, discussing them, or promoting them in any way.

  35. I know a lot of sites use directory content management systems to create hundreds or even thousands of links pointing to their sites. That doesn’t seem to work as well as it did in the past for things like ‘Google bombing’.

    Judging from this Google patent, it will be even less effective in the future.

  36. Hi Roy,

    If someone is counting on the search engines finding those links and using them to “improve” the rankings of pages, then the links will also be viewed, measured, and analyzed to see if they might have been created to attempt to manipulate rankings. I think this is an area that Google continuously gets better at detecting, so that if they can’t detect them now, it’s probably only a matter of time.

Comments are closed.