Intellectual Property and the Internet/Print version

From Wikibooks, open books for an open world

This book looks at the history of intellectual property laws and their effects on the internet.


Intellectual property (IP) is a term referring to a number of distinct types of creations of the mind for which a set of exclusive rights are recognized—and the corresponding fields of law.[1] Under intellectual property law, owners are granted certain exclusive rights to a variety of intangible assets, such as musical, literary, and artistic works; discoveries and inventions; and words, phrases, symbols, and designs. Common types of intellectual property rights include copyrights, trademarks, patents, industrial design rights and trade secrets in some jurisdictions. The term intellectual property is used to describe many very different, unrelated legal concepts.

Although many of the legal principles governing intellectual property have evolved over centuries, it was not until the 19th century that the term "intellectual property" began to be used, and not until the late 20th century that it became commonplace in the majority of the world.[2] The British Statute of Anne 1710 and the Statute of Monopolies 1623 are now seen as the origins of copyright and patent law respectively.[3]

History

Modern usage of the term "intellectual property" goes back at least as far as 1867 with the founding of the North German Confederation, whose constitution granted legislative power over the protection of intellectual property (Schutz des geistigen Eigentums) to the confederation.[4] When the administrative secretariats established by the Paris Convention for the Protection of Industrial Property (1883) and the Berne Convention for the Protection of Literary and Artistic Works (1886) merged in 1893, they located in Berne, and also adopted the term intellectual property in their new combined title, the United International Bureaux for the Protection of Intellectual Property. The organisation subsequently relocated to Geneva in 1960, and was succeeded in 1967 with the establishment of the World Intellectual Property Organization (WIPO) by the Convention Establishing the World Intellectual Property Organization as an agency of the United Nations. According to Lemley, it was only at this point that the term really began to be used in the United States (which had not been a party to the Berne Convention),[2] and it did not enter popular usage until passage of the Bayh-Dole Act in 1980.[5]

"The history of patents does not begin with inventions, but rather with royal grants by Queen Elizabeth I (1558-1603) for monopoly privileges... Approximately 200 years after the end of Elizabeth's reign, however, a patent represents a legal [right] obtained by an inventor providing for exclusive control over the production and sale of his mechanical or scientific invention... [demonstrating] the evolution of patents from royal prerogative to common-law doctrine."[6]

In an 1818 collection of his writings, the French liberal theorist Benjamin Constant argued against the recently introduced idea of "property which has been called intellectual."[7] The term intellectual property can be found in an October 1845 Massachusetts Circuit Court ruling in the patent case Davoll et al. v. Brown, in which Justice Charles L. Woodbury wrote that "only in this way can we protect intellectual property, the labors of the mind, productions and interests are as much a man's own...as the wheat he cultivates, or the flocks he rears."[8] The statement that "discoveries are...property" goes back earlier still: Section 1 of the French patent law of 1791 states, "All new discoveries are the property of the author; to assure the inventor the property and temporary enjoyment of his discovery, there shall be delivered to him a patent for five, ten or fifteen years."[9]

The concept's origins can potentially be traced back further. Jewish law includes several considerations whose effects are similar to those of modern intellectual property laws, though the notion of intellectual creations as property does not seem to exist – notably the principle of Hasagat Ge'vul (unfair encroachment) was used to justify limited-term publisher (but not author) copyright in the 16th century.[10]

Objectives

Until recently, the purpose of intellectual property law was to give as little protection as possible in order to encourage innovation. Historically, therefore, rights were granted only when they were necessary to encourage invention, and were limited in time and scope.[2] Currently, particularly in the United States, the objective of intellectual property legislators and those who support its implementation is "absolute protection": "If some intellectual property is desirable because it encourages innovation, they reason, more is better. The thinking is that creators will not have sufficient incentive to invent unless they are legally entitled to capture the full social value of their inventions."[2] This absolute protection or full value view treats intellectual property as another type of 'real' property, typically adopting its law and rhetoric.

Financial incentive

These exclusive rights allow owners of intellectual property to benefit from the property they have created, providing a financial incentive for the creation of and investment in intellectual property and, in the case of patents, for recouping associated research and development costs.[11] Some commentators, such as David Levine and Michele Boldrin, dispute this justification.[12]

Economic growth

The WIPO treaty and several related international agreements are premised on the notion that the protection of intellectual property rights is essential to maintaining economic growth. The WIPO Intellectual Property Handbook gives two reasons for intellectual property laws:

One is to give statutory expression to the moral and economic rights of creators in their creations and the rights of the public in access to those creations. The second is to promote, as a deliberate act of Government policy, creativity and the dissemination and application of its results and to encourage fair trading which would contribute to economic and social development.[13]

The Anti-Counterfeiting Trade Agreement (ACTA) states that "effective enforcement of intellectual property rights is critical to sustaining economic growth across all industries and globally".[14]

Economists estimate that two-thirds of the value of large businesses in the U.S. can be traced to intangible assets.[15] "IP-intensive industries" are estimated to generate 72 percent more value added (price minus material cost) per employee than "non-IP-intensive industries".[16]

A joint research project of the WIPO and the United Nations University measuring the impact of IP systems on six Asian countries found "a positive correlation between the strengthening of the IP system and subsequent economic growth."[17]

Economists have also shown that IP can be a disincentive to innovation when that innovation is drastic. IP makes excludable non-rival intellectual products that were previously non-excludable. This creates economic inefficiency as long as the monopoly is held. A disincentive to direct resources toward innovation can occur when monopoly profits are less than the overall welfare improvement to society. This situation can be seen as a market failure, and an issue of appropriability.[18]

Morality

According to Article 27 of the Universal Declaration of Human Rights, "everyone has the right to the protection of the moral and material interests resulting from any scientific, literary or artistic production of which he is the author".[19] Although the relationship between intellectual property and human rights is a complex one,[20] there are moral arguments for intellectual property.

Various moral justifications for private property can also be used to argue in favor of the morality of intellectual property, such as:

  1. Natural Rights/Justice Argument: this argument is based on Locke’s idea that a person has a natural right over the labour and/or products produced by his/her body. Appropriating these products is viewed as unjust. Although Locke never explicitly stated that natural right applied to products of the mind,[21] it is possible to apply his argument to intellectual property rights, in which it would be unjust for people to misuse another's ideas.[22]
  2. Utilitarian-Pragmatic Argument: according to this rationale, a society that protects private property is more effective and prosperous than societies that do not. Innovation and invention in 19th-century America have been attributed to the development of the patent system.[22] By providing innovators with "durable and tangible return on their investment of time, labor, and other resources", intellectual property rights seek to maximize social utility.[23] The presumption is that they promote public welfare by encouraging the "creation, production, and distribution of intellectual works".[23]
  3. “Personality” Argument: this argument is based on a quote from Hegel: “Every man has the right to turn his will upon a thing or make the thing an object of his will, that is to say, to set aside the mere thing and recreate it as his own”.[22] European intellectual property law is shaped by this notion that ideas are an “extension of oneself and of one’s personality”.[22]

Writer Ayn Rand argued that the protection of intellectual property is essentially a moral issue. The belief is that the human mind itself is the source of wealth and survival, and that all property at its base is intellectual property. To violate intellectual property is therefore no different morally from violating other property rights, which compromises the very processes of survival and therefore constitutes an immoral act.[24]

Criticism

The term itself

Free Software Foundation founder Richard Stallman argues that, although the term "intellectual property" is in wide use, it should be rejected altogether, because it "systematically distorts and confuses these issues, and its use was and is promoted by those who gain from this confusion." He claims that the term "operates as a catch-all to lump together disparate laws [which] originated separately, evolved differently, cover different activities, have different rules, and raise different public policy issues" and that it creates a "bias" by confusing these monopolies with ownership of limited physical things, likening them to "property rights".[25] Stallman advocates referring to copyrights, patents and trademarks in the singular and warns against abstracting disparate laws into a collective term.

Lawrence Lessig, along with many other copyleft and free software activists, has criticized the implied analogy with physical property (like land or an automobile). They argue such an analogy fails because physical property is generally rivalrous while intellectual works are non-rivalrous (that is, if one makes a copy of a work, the enjoyment of the copy does not prevent enjoyment of the original).[26]

Limitations

Some critics of intellectual property, such as those in the free culture movement, point at intellectual monopoly privilege as harming health, preventing progress, and benefiting concentrated interests to the detriment of the masses,[27][28] and argue that the public interest is harmed by ever expansive monopolies in the form of copyright extensions, software patents and business method patents.

The Committee on Economic, Social and Cultural Rights recognizes that "conflicts may exist between the respect for and implementation of current intellectual property systems and other human rights".[29] It argues that intellectual property tends to be governed by economic goals when it should be viewed primarily as a social product; in order to serve human well-being, intellectual property systems must respect and conform to human rights laws. According to the Committee, when systems fail to do so they risk infringing upon the human right to food and health, and to cultural participation and scientific benefits.[30]

Some libertarian critics of intellectual property have argued that allowing property rights in ideas and information creates artificial scarcity and infringes on the right to own tangible property. Stephan Kinsella uses the following scenario to argue this point:

[I]magine the time when men lived in caves. One bright guy—let's call him Galt-Magnon—decides to build a log cabin on an open field, near his crops. To be sure, this is a good idea, and others notice it. They naturally imitate Galt-Magnon, and they start building their own cabins. But the first man to invent a house, according to IP advocates, would have a right to prevent others from building houses on their own land, with their own logs, or to charge them a fee if they do build houses. It is plain that the innovator in these examples becomes a partial owner of the tangible property (e.g., land and logs) of others, due not to first occupation and use of that property (for it is already owned), but due to his coming up with an idea. Clearly, this rule flies in the face of the first-user homesteading rule, arbitrarily and groundlessly overriding the very homesteading rule that is at the foundation of all property rights.[31]

Other criticism of intellectual property law concerns the tendency of the protections of intellectual property to expand, both in duration and in scope. The trend has been toward longer copyright protection[32] (raising fears that it may some day be eternal).[26][33][34][35] In addition, the developers and controllers of items of intellectual property have sought to bring more items under the protection. Patents have been granted for living organisms[36] (and in the US, certain living organisms have been patentable for over a century),[37] and colors have been trademarked.[38] Because they are systems of government-granted monopolies, copyrights, patents, and trademarks are called intellectual monopoly privileges (IMP), a topic on which several academics, including Birgitte Andersen[39] and Thomas Alured Faunce,[40] have written.

Thomas Jefferson once said in a letter to Isaac McPherson on August 13, 1813:

"If nature has made any one thing less susceptible than all others of exclusive property, it is the action of the thinking power called an idea, which an individual may exclusively possess as long as he keeps it to himself; but the moment it is divulged, it forces itself into the possession of every one, and the receiver cannot dispossess himself of it. Its peculiar character, too, is that no one possesses the less, because every other possesses the whole of it. He who receives an idea from me, receives instruction himself without lessening mine; as he who lights his taper at mine, receives light without darkening me."[41]

In 2005 the Royal Society for the encouragement of Arts, Manufactures & Commerce launched the Adelphi Charter, aimed at creating an international policy statement to frame how governments should make balanced intellectual property law.

Another limitation of current U.S. intellectual property legislation is its focus on individual and joint works; thus, copyright protection can only be obtained in 'original' works of authorship.[42] This definition excludes any works that are the result of community creativity, for example Native American songs and stories; current legislation does not recognize the uniqueness of indigenous cultural 'property' and its ever-changing nature. Simply asking native cultures to 'write down' their cultural artifacts on tangible media ignores their necessary orality and enforces a Western bias of the written form as more authoritative.

Ethics

The ethical problems raised by intellectual property rights are most pertinent when socially valuable goods, such as life-saving medicines and genetically modified seeds, are given intellectual property protection. For example, pharmaceutical companies that develop new medicines apply for intellectual property rights in order to prevent other companies from manufacturing their product without the additional cost of research and development. The application of intellectual property rights allows companies to charge more than the marginal cost of production in order to recoup the costs of research and development.[43] However, this immediately excludes from the market anyone who cannot afford the cost of the product, in this case a life-saving drug.

The availability problem is a consequence of the fact that the incentivizing mechanism for innovation constituted by IPRs establishes a direct link between the incentive to innovate and the price of the innovative product. Under an IPR driven regime, profits are generated exclusively from sales. This means that the higher a price a product can command on the market, the higher is the incentive to invest resources into the R&D process of it. An IPR driven regime is therefore not a regime that is conducive to the investment of R&D of products that are socially valuable to predominately poor populations...[43]

Further reading

Arai, Hisamitsu. "Intellectual Property Policies for the Twenty-First Century: The Japanese Experience in Wealth Creation", WIPO Publication Number 834 (E). 2000. wipo.int
Branstetter, Lee, Raymond Fishman and C. Fritz Foley. "Do Stronger Intellectual Property Rights Increase International Technology Transfer? Empirical Evidence from US Firm-Level Data". NBER Working Paper 11516. July 2005. weblog.ipcentral.info
Burk, Dan L.; Lemley, Mark A. (2009). The Patent Crisis and How the Courts Can Solve It. University of Chicago Press. ISBN 9780226080611. 
Connell, Shaun. "Intellectual Ownership". October 2007. rebithofffreedom.org
Farah, Paolo and Cima, Elena. "China’s Participation in the World Trade Organization: Trade in Goods, Services, Intellectual Property Rights and Transparency Issues" in Aurelio Lopez-Tarruella Martinez (ed.), El comercio con China. Oportunidades empresariales, incertidumbres jurídicas, Tirant lo Blanch, Valencia (Spain) 2010, pp. 85–121. ISBN 978-84-8456-981-7. Available at SSRN.com
Gowers, Andrew. "Gowers Review of Intellectual Property". Her Majesty's Treasury, November 2006. hm-treasury.gov.uk ISBN 0-118-40483-0.
Hahn, Robert W., Intellectual Property Rights in Frontier Industries: Software and Biotechnology, AEI Press, March 2005.
Kinsella, Stephan. "Against Intellectual Property". Journal of Libertarian Studies 15.2 (Spring 2001): 1-53. mises.org
Lai, Edwin. "The Economics of Intellectual Property Protection in the Global Economy". Princeton University. April 2001. dklevine.com
Lee, Richmond. Scope and Interplay of IP Rights Accralaw offices.
Lessig, Lawrence. "Free Culture: How Big Media Uses Technology and the Law to Lock Down Culture and Control Creativity". New York: Penguin Press, 2004. free-culture.cc.
Lindberg, Van. Intellectual Property and Open Source: A Practical Guide to Protecting Code. O'Reilly Books, 2008. ISBN 0-596-51796-3, ISBN 9780596517960.
Maskus, Keith E. "Intellectual Property Rights and Economic Development". Case Western Reserve Journal of International Law, Vol. 32, 471. journals/jil/32-3/maskusarticle.pdf law.case.edu
Mazzone, Jason. "Copyfraud". Brooklyn Law School, Legal Studies Paper No. 40. New York University Law Review 81 (2006): 1027. (Abstract.)
Miller, Arthur Raphael, and Michael H. Davis. Intellectual Property: Patents, Trademarks, and Copyright. 3rd ed. New York: West/Wadsworth, 2000. ISBN 0-314-23519-1.
Mossoff, A. 'Rethinking the Development of Patents: An Intellectual History, 1550-1800,' Hastings Law Journal, Vol. 52, p. 1255, 2001
Rozanski, Felix. "Developing Countries and Pharmaceutical Intellectual Property Rights: Myths and Reality" stockholm-network.org
Perelman, Michael. Steal This Idea: Intellectual Property and The Corporate Confiscation of Creativity. Palgrave Macmillan, 2004.
Rand, Ayn. "Patents and Copyrights" in Ayn Rand, ed. 'Capitalism: The Unknown Ideal,' New York: New American Library, 1966, pp. 126–128
Reisman, George. 'Capitalism: A Complete & Integrated Understanding of the Nature & Value of Human Economic Life,' Ottawa, Illinois: 1996, pp. 388–389
Schechter, Roger E., and John R. Thomas. Intellectual Property: The Law of Copyrights, Patents and Trademarks. New York: West/Wadsworth, 2003, ISBN 0-314-06599-7.
Schneider, Patricia H. "International Trade, Economic Growth and Intellectual Property Rights: A Panel Data Study of Developed and Developing Countries". July 2004. mtholyoke.edu
Shapiro, Robert and Nam Pham. "Economic Effects of Intellectual Property-Intensive Manufacturing in the United States". July 2007. the-value-of.ip.org
Vaidhyanathan, Siva. The Anarchist in the Library: How the Clash Between Freedom and Control Is Hacking the Real World and Crashing the System. New York: Basic Books, 2004.


Copyright

Copyright is a legal concept, enacted by most governments, giving the creator of an original work exclusive rights to it, usually for a limited time. Generally, it is "the right to copy", but also gives the copyright holder the right to be credited for the work, to determine who may adapt the work to other forms, who may perform the work, who may financially benefit from it, and other, related rights. It is an intellectual property form (like the patent, the trademark, and the trade secret) applicable to any expressible form of an idea or information that is substantive and discrete.[44]

Copyright initially was conceived as a way for government to restrict printing; the contemporary intent of copyright is to promote the creation of new works by giving authors control of and profit from them. Copyrights are said to be territorial, which means that they do not extend beyond the territory of a specific state unless that state is a party to an international agreement. Today, however, this is less relevant since most countries are parties to at least one such agreement. While many aspects of national copyright laws have been standardized through international copyright agreements, copyright laws of most countries have some unique features.[45] Typically, the duration of copyright is the whole life of the creator plus fifty to a hundred years from the creator's death, or a finite period for anonymous or corporate creations. Some jurisdictions have required formalities for establishing copyright, but most recognize copyright in any completed work, without formal registration. Generally, copyright is enforced as a civil law matter, though some jurisdictions do apply criminal sanctions.

Most jurisdictions recognize copyright limitations, allowing "fair" exceptions to the creator's exclusivity of copyright, and giving users certain rights. The development of digital media and computer network technologies has prompted reinterpretation of these exceptions, introduced new difficulties in enforcing copyright, and inspired additional challenges to copyright law's philosophic basis. Simultaneously, businesses with great economic dependence upon copyright have advocated the extension and expansion of their copyrights, and sought additional legal and technological enforcement.

Justification

Some take the approach of looking for coherent justifications of established copyright systems, while others start with general ethical theories, such as utilitarianism, and try to analyse policy through that lens. Another approach denies the meaningfulness of any ethical justification for existing copyright law, viewing it simply as a result (and perhaps an undesirable result) of political processes.

Another widely debated issue is the relationship between copyrights and other forms of "intellectual property", and material property. Most scholars of copyright agree that it can be called a kind of property, because it involves the exclusion of others from something. But there is disagreement about the extent to which that fact should allow the transportation of other beliefs and intuitions about material possessions.

There are many other philosophical questions which arise in the jurisprudence of copyright. They include such problems as determining when one work is "derived" from another, or deciding when information has been placed in a "tangible" or "material" form.

Some critics claim copyright law protects corporate interests while criminalizing legitimate use, while proponents argue the law is fair and just.[46]

History

Copyright was invented after the advent of the printing press and with wider public literacy. As a legal concept, its origins in Britain were a reaction to printers' monopolies at the beginning of the eighteenth century. Charles II of England was concerned by the unregulated copying of books, and the Licensing Act of 1662 was passed by Act of Parliament,[47] which established a register of licensed books and required a copy to be deposited with the Stationers' Company, essentially continuing the licensing of material that had long been in effect.

The Statute of Anne came into force in 1710

The British Statute of Anne (1709) further alluded to individual rights of the artist, beginning: "Whereas Printers, Booksellers, and other Persons, have of late frequently taken the Liberty of Printing... Books, and other Writings, without the Consent of the Authors... to their very great Detriment, and too often to the Ruin of them and their Families:..."[48] A right to benefit financially from the work is articulated, and court rulings and legislation have recognized a right to control the work, such as ensuring that the integrity of it is preserved. An irrevocable right to be recognized as the work's creator appears in some countries' copyright laws.

Aside from the role of governments and the church, the history of copyright law is in essential ways also connected to the rise of capitalism and the attendant extension of commodity relations to the realm of creative human activities, such as literary and artistic production. Similarly, different cultural attitudes, social organizations, economic models and legal frameworks are seen to account for why copyright emerged in Europe and not, for example, in Asia. In the Middle Ages in Europe, there was generally a lack of any concept of literary property due to the general relations of production, the specific organization of literary production and the role of culture in society. The latter refers to the tendency of oral societies, such as that of Europe in the medieval period, to view knowledge as the product, expression and property of the collective. Not until capitalism emerges in Europe with its individualist ideological underpinnings does the conception of intellectual property and by extension copyright law emerge. Intellectual production comes to be seen as a product of an individual and their property, rather than a collective or social product which belongs in the commons. The most significant point is that under the capitalist mode of production, patent and copyright laws support in fundamental and thoroughgoing ways the expansion of the range of creative human activities that can be commodified. This parallels the ways in which capitalism led to the commodification of many aspects of social life that hitherto had no monetary or economic value per se.[49]

The Statute of Anne was the first real copyright act, and gave the publishers rights for a fixed period, after which the copyright expired.[50] Copyright has grown from a legal concept regulating copying rights in the publishing of books and maps to one with a significant effect on nearly every modern industry, covering such items as sound recordings, films, photographs, software, and architectural works.

The Copyright Act of 1790 in the Columbian Centinel

The Copyright Clause of the United States Constitution (1787) authorized copyright legislation: "To promote the Progress of Science and useful Arts, by securing for limited Times to Authors and Inventors the exclusive Right to their respective Writings and Discoveries." That is, by guaranteeing them a period of time in which they alone could profit from their works, they would be enabled and encouraged to invest the time required to create them, and this would be good for society as a whole. A right to profit from the work has been the philosophical underpinning for much legislation extending the duration of copyright, to the life of the creator and beyond, to his heirs.

The 1886 Berne Convention for the Protection of Literary and Artistic Works first established recognition of copyrights among sovereign nations, rather than merely bilaterally. Under the Berne Convention, copyrights for creative works do not have to be asserted or declared, as they are automatically in force at creation: an author need not "register" or "apply for" a copyright in countries adhering to the Berne Convention.[51] As soon as a work is "fixed", that is, written or recorded on some physical medium, its author is automatically entitled to all copyrights in the work, and to any derivative works unless and until the author explicitly disclaims them, or until the copyright expires. The Berne Convention also resulted in foreign authors being treated equivalently to domestic authors, in any country signed onto the Convention. The UK signed the Berne Convention in 1887 but did not implement large parts of it until 100 years later with the passage of the Copyright, Designs and Patents Act of 1988. The USA did not sign the Berne Convention until 1989.[52]

The United States and most Latin American countries instead entered into the Buenos Aires Convention in 1910, which required a copyright notice (such as "all rights reserved") on the work, and permitted signatory nations to limit the duration of copyrights to shorter and renewable terms.[53][54][55] The Universal Copyright Convention was drafted in 1952 as another less demanding alternative to the Berne Convention, and ratified by nations such as the Soviet Union and developing nations.

The regulations of the Berne Convention are incorporated into the World Trade Organization's Agreement on Trade-Related Aspects of Intellectual Property Rights, or TRIPS agreement (1995), thus giving the Berne Convention effectively near-global application.[56] The 2002 World Intellectual Property Organization Copyright Treaty, or WIPO Copyright Treaty, enacted greater restrictions on the use of technology to copy works in the nations that ratified it.

Scope

Copyright may apply to a wide range of creative, intellectual, or artistic forms, or "works". Specifics vary by jurisdiction, but these can include poems, theses, plays, other literary works, movies, dances, musical compositions, audio recordings, paintings, drawings, sculptures, photographs, software, radio and television broadcasts, and industrial designs. Graphic designs and industrial designs may have separate or overlapping laws applied to them in some jurisdictions.[57][58]

Copyright does not cover ideas and information themselves, only the form or manner in which they are expressed.[59] For example, the copyright to a Mickey Mouse cartoon restricts others from making copies of the cartoon or creating derivative works based on The Walt Disney Company's particular anthropomorphic mouse, but doesn't prohibit the creation of other works about anthropomorphic mice in general, so long as they're different enough to not be judged copies of Disney's.[59] In many jurisdictions, copyright law makes exceptions to these restrictions when the work is copied for the purpose of commentary or other related uses. Meanwhile, other laws may impose additional restrictions that copyright does not — such as trademarks and patents.

Copyright laws are standardized somewhat through international conventions such as the Berne Convention for the Protection of Literary and Artistic Works and the Universal Copyright Convention. These multilateral treaties have been ratified by nearly all countries, and international organizations such as the European Union or World Trade Organization require their member states to comply with them.

Obtaining and enforcing copyright[edit]

A copyright certificate for a purported proof of Fermat's Last Theorem, issued by the State Department of Intellectual Property of Ukraine

Typically, a work must meet minimal standards of originality in order to qualify for copyright, and the copyright expires after a set period of time (some jurisdictions may allow this to be extended). Different countries impose different tests, although generally the requirements are low; in the United Kingdom there has to be some 'skill, labour and judgment' that has gone into it.[60] In Australia and the United Kingdom it has been held that a single word is insufficient to comprise a copyright work. However, single words or a short string of words can sometimes be registered as a trademark instead.

Copyright law recognises the right of an author based on whether the work actually is an original creation, rather than based on whether it is unique; two authors may own copyright on two substantially identical works, if it is determined that the duplication was coincidental, and neither was copied from the other.

In all countries where the Berne Convention standards apply, copyright is automatic, and need not be obtained through official registration with any government office. Once an idea has been reduced to tangible form, for example by securing it in a fixed medium (such as a drawing, sheet music, photograph, a videotape, or a computer file), the copyright holder is entitled to enforce his or her exclusive rights.[51] However, while registration isn't needed to exercise copyright, in jurisdictions where the laws provide for registration, it serves as prima facie evidence of a valid copyright and enables the copyright holder to seek statutory damages and attorney's fees. (In the USA, registering after an infringement only enables one to receive actual damages and lost profits.)

The original holder of the copyright may be the employer of the author rather than the author himself, if the work is a "work for hire".[61] For example, in English law the Copyright, Designs and Patents Act 1988 provides that if a copyrighted work is made by an employee in the course of that employment, the copyright is automatically owned by the employer.

Copyrights are generally enforced by the holder in a civil law court, but there are also criminal infringement statutes in some jurisdictions. While central registries are kept in some countries which aid in proving claims of ownership, registering does not necessarily prove ownership, nor does the fact of copying (even without permission) necessarily prove that copyright was infringed. Criminal sanctions are generally aimed at serious counterfeiting activity, but are now becoming more commonplace as copyright collectives such as the RIAA increasingly target the file-sharing home Internet user. Thus far, however, most such cases against file sharers have been settled out of court. (See: File sharing and the law)

Copyright notices in the U.S.[edit]

A copyright symbol used in a copyright notice

Prior to 1989, use of a copyright notice, consisting of the copyright symbol (©, the letter C inside a circle), the abbreviation "Copr.", or the word "Copyright", followed by the year of the first publication of the work and the name of the copyright holder, was part of United States statutory requirements.[62][63] Several years may be noted if the work has gone through substantial revisions. The proper copyright notice for sound recordings of musical or other audio works is a sound recording copyright symbol (℗, the letter P inside a circle), which indicates a sound recording copyright. Similarly, the phrase "all rights reserved" was once required to assert copyright.

In 1989, the U.S. enacted the Berne Convention Implementation Act, amending the 1976 Copyright Act to conform to most of the provisions of the Berne Convention for the Protection of Literary and Artistic Works. As a result, the use of copyright notices has become optional to claim copyright, because the Berne Convention makes copyright automatic.[64] However, the lack of a copyright notice using these marks may have consequences in terms of reduced damages in an infringement lawsuit; using notices of this form may reduce the likelihood of a defense of "innocent infringement" being successful.

"Poor man's copyright"[edit]

A widely circulated strategy to avoid the cost of copyright registration is referred to as the "poor man's copyright." It proposes that the creator send the work to himself in a sealed envelope by registered mail, using the postmark to establish the date. This technique has not been recognized in any published opinions of the United States courts. The United States Copyright Office makes clear that the technique is no substitute for actual registration.[65] The United Kingdom Intellectual Property Office discusses the technique but does not recommend its use.[66]

Exclusive rights[edit]

Several exclusive rights typically attach to the holder of a copyright:

  • to produce copies or reproductions of the work and to sell those copies (including, typically, electronic copies)
  • to import or export the work
  • to create derivative works (works that adapt the original work)
  • to perform or display the work publicly
  • to sell or assign these rights to others
  • to transmit or display by radio or video[67]

The phrase "exclusive right" means that only the copyright holder is free to exercise those rights, and others are prohibited from using the work without the holders permission. Copyright is sometimes called a "negative right", as it serves to prohibit certain people (e.g., readers, viewers, or listeners, and primarily publishers and would be publishers) from doing something they would otherwise be able to do, rather than permitting people (e.g., authors) to do something they would otherwise be unable to do. In this way it is similar to the unregistered design right in English law and European law. The rights of the copyright holder also permit him/her to not use or exploit their copyright, for some or all of the term.

There is, however, a critique which rejects this assertion as being based on a philosophical interpretation of copyright law that is not universally shared. There is also debate on whether copyright should be considered a property right or a moral right. Many argue that copyright does not exist merely to restrict third parties from publishing ideas and information, and that defining copyright purely as a negative right is incompatible with the public policy objective of encouraging authors to create new works and enrich the public domain.

The right to adapt a work means to transform the way in which the work is expressed. Examples include developing a stage play or film script from a novel, translating a short story, and making a new arrangement of a musical work.

Founded in 2001, Creative Commons (CC) is a non-profit organization headquartered in California with over 100 affiliates located throughout the world.[68] CC aims to facilitate the legal sharing of creative works. To this end the organization provides a number of copyright license options to the public, free of charge. These licenses allow copyright holders to define conditions under which others may use a work and to specify what types of use are acceptable.[68] Terms of use have traditionally been negotiated on an individual basis between copyright holder and potential licensee. Therefore, a general CC license outlining which rights the copyright holder is willing to waive enables the general public to use such works more freely. Four general types of CC licenses are available. These are based upon copyright holder stipulations such as whether he or she is willing to allow modifications to the work, whether he or she permits the creation of derivative works, and whether he or she is willing to permit commercial use of the work.[69]

Individuals may obtain a CC license via the Creative Commons website. As of 2009 approximately 130 million works had been released under such licenses.[69]

Limitations and exceptions to copyright[edit]

Idea-expression dichotomy and the merger doctrine[edit]

The idea-expression divide differentiates between ideas and expression, and states that copyright protects only the original expression of ideas, and not the ideas themselves. This principle, first clarified in the 1879 case of Baker v. Selden, has since been codified by the Copyright Act of 1976 at 17 U.S.C. § 102(b).

The first-sale doctrine and exhaustion of rights[edit]

Copyright law does not restrict the owner of a copy from reselling legitimately obtained copies of copyrighted works, provided that those copies were originally produced by or with the permission of the copyright holder. It is therefore legal, for example, to resell a copyrighted book or CD. In the United States this is known as the first-sale doctrine, and was established by the courts to clarify the legality of reselling books in second-hand bookstores. Some countries may have parallel importation restrictions that allow the copyright holder to control the aftermarket. This may mean for example that a copy of a book that does not infringe copyright in the country where it was printed does infringe copyright in a country into which it is imported for retailing. The first-sale doctrine is known as exhaustion of rights in other countries and is a principle which also applies, though somewhat differently, to patent and trademark rights. It is important to note that the first-sale doctrine permits the transfer of the particular legitimate copy involved. It does not permit making or distributing additional copies.

In addition, copyright, in most cases, does not prohibit one from acts such as modifying, defacing, or destroying his or her own legitimately obtained copy of a copyrighted work, so long as duplication is not involved. However, in countries that implement moral rights, a copyright holder can in some cases successfully prevent the mutilation or destruction of a work that is publicly visible.

Fair use and fair dealing[edit]

Copyright does not prohibit all copying or replication. In the United States, the fair use doctrine, codified by the United States Copyright Act of 1976 as 17 U.S.C. Section 107, permits some copying and distribution without permission of the copyright holder or payment to same. The statute does not clearly define fair use, but instead gives four non-exclusive factors to consider in a fair use analysis. Those factors are:

  1. the purpose and character of the use
  2. the nature of the copyrighted work
  3. what amount and proportion of the whole work was taken, and
  4. the effect of the use upon the potential market for or value of the copyrighted work.[70]

In the United Kingdom and many other Commonwealth countries, a similar notion of fair dealing was established by the courts or through legislation. The concept is sometimes not well defined; however, in Canada, private copying for personal use has been expressly permitted by statute since 1999. In Australia, the fair dealing exceptions under the Copyright Act 1968 (Cth) are a limited set of circumstances under which copyrighted material can be legally copied or adapted without the copyright holder's consent. Fair dealing uses are research and study; review and critique; news reportage; and the giving of professional advice (i.e. legal advice). Under current Australian law it is still a breach of copyright to copy, reproduce or adapt copyright material for personal or private use without permission from the copyright owner. Other technical exemptions from infringement may also apply, such as the temporary reproduction of a work in machine-readable form for a computer.

In the United States the Audio Home Recording Act (AHRA, codified in Chapter 10 of Title 17 in 1992) prohibits action against consumers making noncommercial recordings of music, in return for royalties on both media and devices, plus mandatory copy-control mechanisms on recorders.

Section 1008. Prohibition on certain infringement actions
No action may be brought under this title alleging infringement of copyright based on the manufacture, importation, or distribution of a digital audio recording device, a digital audio recording medium, an analog recording device, or an analog recording medium, or based on the noncommercial use by a consumer of such a device or medium for making digital musical recordings or analog musical recordings.[71]

Later acts amended U.S. copyright law so that for certain purposes making 10 copies or more is construed to be commercial, but there is no general rule permitting such copying. Indeed, making one complete copy of a work, or in many cases using a portion of it, for commercial purposes will not be considered fair use. The Digital Millennium Copyright Act prohibits the manufacture, importation, or distribution of devices whose intended use, or only significant commercial use, is to bypass an access or copy control put in place by a copyright owner.[57] An appellate court has held that fair use is not a defense to engaging in such distribution.

Transfer, licensing and assignment[edit]

A copyright, or aspects of it, may be assigned or transferred from one party to another.[72] For example, a musician who records an album will often sign an agreement with a record company in which the musician agrees to transfer all copyright in the recordings in exchange for royalties and other considerations. The creator (and original copyright holder) benefits, or expects to, from production and marketing capabilities far beyond those available to the author alone. In the digital age, music may be copied and distributed at minimal cost through the Internet; however, the record industry attempts to provide promotion and marketing for the artist and his or her work so it can reach a much larger audience. A copyright holder need not transfer all rights completely, though many publishers will insist. Some of the rights may be transferred, or else the copyright holder may grant another party a non-exclusive license to copy and/or distribute the work in a particular region or for a specified period of time. A transfer or licence may have to meet particular formal requirements in order to be effective;[72] see section 239 of the Australian Copyright Act 1968 (Cth). Under Australian law, it is not enough to pay for a work to be created in order to also own the copyright. The copyright itself must be expressly transferred in writing.

Under the U.S. Copyright Act, a transfer of ownership in copyright must be memorialized in a writing signed by the transferor. For that purpose, ownership in copyright includes exclusive licenses of rights. Thus exclusive licenses, to be effective, must be granted in a written instrument signed by the grantor. No special form of transfer or grant is required. A simple document that identifies the work involved and the rights being granted is sufficient. Non-exclusive grants (often called non-exclusive licenses) need not be in writing under U.S. law. They can be oral or even implied by the behavior of the parties. Transfers of copyright ownership, including exclusive licenses, may and should be recorded in the U.S. Copyright Office. (Information on recording transfers is available on the Office's web site.) While recording is not required to make the grant effective, it offers important benefits, much like those obtained by recording a deed in a real estate transaction.

Copyright may also be licensed.[72] Some jurisdictions may provide that certain classes of copyrighted works be made available under a prescribed statutory license (e.g. musical works in the United States used for radio broadcast or performance). This is also called a compulsory license, because under this scheme, anyone who wishes to copy a covered work does not need the permission of the copyright holder, but instead merely files the proper notice and pays a set fee established by statute (or by an agency decision under statutory guidance) for every copy made.[72] Failure to follow the proper procedures would place the copier at risk of an infringement suit. Because of the difficulty of tracking every individual work, copyright collectives or collecting societies and performance rights organizations (such as ASCAP, BMI, and SESAC) have been formed to collect royalties for hundreds or thousands of works at once. Though this market solution bypasses the statutory license, the availability of the statutory fee still helps dictate the price per work that collective rights organizations charge, driving it down to what avoidance of procedural hassle would justify.

Similar legal rights[edit]

Copyright law covers the creative or artistic expression of an idea. Patent law covers inventions. Trademark law covers distinctive signs which are used in relation to products or services as indicators of origin, as does, in a similar fashion, trade dress. Registered designs law covers the look or appearance of a manufactured or functional article. Trade secret law covers secret or sensitive knowledge or information.[72]

Although copyright and trademark laws are theoretically distinct, more than one form of protection may cover the same item or subject matter. For example, in the case of the Mickey Mouse cartoon, the image and name of Mickey Mouse would be the subject of trademark legislation, while the cartoon itself would be subject to copyright. Titles and character names from books or movies may also be trademarked while the works from which they are drawn may qualify for copyright.

Another point of distinction is that a copyright (and a patent) is generally subject to a statutorily determined term, whereas a trademark registration may remain in force indefinitely if the trademark is periodically used and renewal fees continue to be duly paid to the relevant jurisdiction's trade marks office or registry. Once the term of a copyright has expired, the formerly copyrighted work enters the public domain and may be freely used or exploited by anyone. Courts in the United States and the United Kingdom have rejected the doctrine of a common law copyright. Public domain works should not be confused with works that are publicly available. Works posted on the internet, for example, are publicly available, but are not generally in the public domain. Copying such works may therefore violate the author's copyright.

Useful articles[edit]

If a pictorial, graphic or sculptural work is a useful article, it is copyrighted only if its aesthetic features are separable from its utilitarian features. A useful article is an article having an intrinsic utilitarian function that is not merely to portray the appearance of the article or to convey information. The aesthetic features must be separable from the functional aspects to be copyrighted.[73]

There are two primary approaches to the separability issue: physical separability and conceptual separability. Physical separability is the ability to physically remove the aesthetic element from the functional article. Conceptual separability can be found in several different ways. It may be present if the useful article is also shown to be appreciated for its aesthetic appeal, or under the design approach, the idea that separability exists only if the designer is able to make aesthetic choices that are unaffected by functional considerations. A question may also be asked of whether an individual would think of the aesthetic aspects of the work as separate from the functional aspects.

There are several different tests available for conceptual separability. The first, the Primary Use test, asks how the thing is primarily used: as art or as a functional object? The second, the Marketable as Art test, asks whether the article can be sold as art, whether functional or not. This test does not have much backing, as almost anything can be sold as art. The third test, Temporal Displacement, asks whether an individual could conceptualize the article as art without conceptualizing its functionality at the same time. Finally, the Denicola test says that copyrightability should ultimately depend on the extent to which the work reflects artistic expression uninhibited by functional considerations. If something came to have a pleasing shape because there were functional considerations, the artistic aspect was constrained by those concerns.

Accessible Copies[edit]

It is legal in several countries including the United Kingdom and the United States to produce alternative versions (for example, in large print or braille) of a copyrighted work to provide improved access to a work for blind and visually impaired persons without permission from the copyright holder.[74][75]

Duration[edit]

Expansion of U.S. copyright law (Assuming authors create their works at age 35 and live for seventy years)

Copyright subsists for a variety of lengths in different jurisdictions. The length of the term can depend on several factors, including the type of work (e.g. musical composition, novel), whether the work has been published or not, and whether the work was created by an individual or a corporation. In most of the world, the default length of copyright is the life of the author plus either 50 or 70 years. In the United States, the term for most existing works is a fixed number of years after the date of creation or publication. Under most countries' laws (for example, the United States and the United Kingdom[76]), copyrights expire at the end of the calendar year in question.
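The "life of the author plus 50 or 70 years" rule, combined with expiry at the end of the calendar year, can be sketched as a small calculation. This is an illustrative simplification only: the function name and defaults below are hypothetical, and real determinations depend on national law, wartime extensions, and transitional rules discussed later in this section.

```python
# A minimal sketch of the "life plus N years" rule described above,
# assuming a jurisdiction where copyright runs until the end of the
# calendar year of the author's death plus the statutory term.
# Not legal advice; consult the actual national statute.

def copyright_expiry_year(death_year: int, term_years: int = 70) -> int:
    """Return the last calendar year in which copyright subsists."""
    # The work enters the public domain on 1 January of the following year.
    return death_year + term_years

# An author who died in 1950, under a life-plus-70 regime, remains
# in copyright through the end of 2020; under life-plus-50, through 2000.
print(copyright_expiry_year(1950, 70))  # 2020
print(copyright_expiry_year(1950, 50))  # 2000
```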

The length and requirements for copyright duration are subject to change by legislation, and since the early 20th century there have been a number of adjustments made in various countries, which can make determining the duration of a given copyright somewhat difficult. For example, the United States used to require copyrights to be renewed after 28 years to stay in force, and formerly required a copyright notice upon first publication to gain coverage. In Italy and France, there were wartime extensions that could increase the term by approximately 6 years in Italy and up to about 14 in France. Many countries have extended the length of their copyright terms (sometimes retroactively). International treaties establish minimum terms for copyrights, but individual countries may enforce longer terms than those.[77]

In the United States, all books and other works published before 1923 have expired copyrights and are in the public domain.[78] In addition, works published before 1964 that did not have their copyrights renewed 28 years after first publication year also are in the public domain, except that books originally published outside the US by non-Americans are exempt from this requirement, if they are still under copyright in their home country (see How Can I Tell Whether a Copyright Was Renewed for more details).
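The U.S. rules above (pre-1923 publication, and 1923–1963 publications whose copyright was not renewed) reduce to a simple decision procedure. The sketch below is hypothetical and deliberately incomplete: it ignores the foreign-work exemption mentioned in the text, reflects the cutoff dates as stated at the time of writing, and is not legal advice.

```python
# Rough decision sketch of the U.S. public-domain rules described above.
# Ignores the exemption for foreign works still copyrighted at home.

def us_public_domain(pub_year: int, renewed: bool) -> bool:
    if pub_year < 1923:
        return True            # copyright has long since expired
    if pub_year < 1964 and not renewed:
        return True            # lapsed for want of renewal after 28 years
    return False               # otherwise assume still under copyright

print(us_public_domain(1920, renewed=False))  # True
print(us_public_domain(1950, renewed=False))  # True (never renewed)
print(us_public_domain(1950, renewed=True))   # False
```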

But if the intended exploitation of the work includes publication (or distribution of derivative work, such as a film based on a book protected by copyright) outside the U.S., the terms of copyright around the world must be considered. If the author has been dead more than 70 years, the work is in the public domain in most, but not all, countries. Some works are covered by copyright in Spain for 80 years after the author's death.[79]

In 1998 the length of a copyright in the United States was increased by 20 years under the Sonny Bono Copyright Term Extension Act. This legislation was strongly promoted by corporations which had valuable copyrights which otherwise would have expired, and has been the subject of substantial criticism on this point.[80]

As a curiosity, the famous work Peter Pan, or The Boy Who Wouldn't Grow Up has a complex – and disputed – story of copyright expiry.[81]

Copyright and piracy[edit]

Street hackers and mp3 pirates in Afghanistan

Piracy is considered to be the illegitimate use of materials protected by copyright.[82] For the history of the term 'piracy' see Copyright infringement. For a work to be considered pirated, its illegitimate use must have occurred in a nation that has domestic copyright laws and/or adheres to a bilateral treaty or established international convention such as the Berne Convention or WIPO Copyright Treaty. Improper use of materials outside of this legislation is deemed "unauthorized edition", not piracy.[82]

Piracy primarily targets software, film and music. However, the illegal copying of books and other text works remains common, especially for educational reasons. Statistics regarding the effects of piracy are difficult to determine. Studies have attempted to estimate a monetary loss for industries affected by piracy by predicting what portion of pirated works would have been formally purchased if they had not been freely available.[83] A 2007 estimate put potential consumer losses from piracy in the United States at $18.2 billion.[84] International estimates suggest losses in the billions throughout the last decade.[84] However, other reports indicate that piracy does not have an adverse effect on the entertainment industry.[85]


References[edit]

  1. Raysman, Richard; Pisacreta, Edward A.; Ostrow, Seth H.; Adler, Kenneth A. (1999). Intellectual Property Licensing: Forms and Analysis. New York, New York: Law Journal Press. ISBN 9781588520869. 
  2. Lemley, Mark A. (2005). "Property, Intellectual Property, and Free Riding". Texas Law Review 83: 1031-1033. http://web.archive.org/web/20090226035349/http://www.utexas.edu/law/journals/tlr/abstracts/83/83Lemley.pdf. "property as a common descriptor of the field probably traces to the foundation of the World Intellectual Property Organization (WIPO) by the United Nations.". 
  3. Sherman, Brad; Bently, Lionel (1999). The making of modern intellectual property law: the British experience, 1760-1911. Cambridge University Press. pp. 207. ISBN 9780521563635. http://www.google.com/books?id=u2aMRA-eF1gC&dq=statute+of+anne+copyright&lr=&as_brr=3&source=gbs_navlinks_s. 
  4. "Article 4 No. 6 of the Constitution of 1867" (in German). Hastings Law Journal 52: 1255. 2001. http://www.verfassungen.de/de/de67-18/verfassung67-i.htm. 
  5. Lemley, Mark A. (November 2005). "Property, Intellectual Property, and Free Riding". Stanford Lawyer (73): 4–5. https://law.stanford.edu/stanford-lawyer/articles/property-intellectual-property-and-free-riding/. Retrieved 2019-11-12. 
  6. Mossoff, A. (2001). "Rethinking the Development of Patents: An Intellectual History, 1550-1800" (PDF). Hastings Law Journal 52: 1255. http://papers.ssrn.com/sol3/Delivery.cfm/SSRN_ID863925_code345663.pdf. 
  7. Constant de Rebecque, Benjamin de (1818). Plancher, P.. ed. Collection complète des ouvrages publiés sur le gouvernement représentatif et la constitution actuelle de la France, formant une espèce de cours de politique constitutionnelle. p. 296. https://www.google.com/books/edition/Collection_complète_des_ouvrages_publi/XUo7AAAAcAAJ?hl=en&gbpv=0. 
  8. Davoll et al. v. Brown (PDF). 1 Woodb. & M. 53;1 2 Robb, Pat. Cas. 303; 3 West. Law J. 151; Merw. Pat. Inv. 414. at the Wayback Machine (archived March 22, 2015) Retrieved November 23, 2019.
  9. "A Brief History of the Patent Law of the United States". Legal Education Center. Ladas & Parry LLP. May 7, 2014. Archived from the original on January 26, 2016. https://web.archive.org/web/20160126014953if_/https://ladas.com/education-center/a-brief-history-of-the-patent-law-of-the-united-states-2/. Retrieved November 23, 2019. 
  10. Schneider, Israel (1998). "Jewish Law and Copyright". Jewish Law: Examining Halacha, Jewish Issues and Secular Law. Miami: Center for Halacha and American Law, Aleph Institute. p. 1. Archived from the original on March 18, 2006. https://web.archive.org/web/20060318202348if_/http://www.jlaw.com/Articles/copyright1.html. Retrieved January 17, 2012. 
  11. Schroeder, Doris; Singer, Peter (May 2009). "Prudential Reasons for IPR Reform: A Report for Innova-P2" (PDF). Melbourne, Australia: Centre for Applied Philosophy and Public Ethics. Archived from the original on May 9, 2016. https://web.archive.org/web/20160509203826if_/https://www.uclan.ac.uk/research/explore/projects/assets/cpe_innova_deliverable1_2.pdf. Retrieved November 23, 2019. 
  12. Boldrin, Michele; Levine, David K. (January 2, 2008). Against Intellectual Monopoly. Cambridge, U.K.: Cambridge University Press. doi:10.1017/CBO9780511510854. ISBN 9780521879286. OCLC 232365572. http://www.dklevine.com/general/intellectual/againstfinal.htm. Retrieved January 17, 2012.  Archived January 2, 2010 at the Wayback Machine
  13. http://www.wipo.int/export/sites/www/about-ip/en/iprm/pdf/ch1.pdf p. 3.
  14. http://web.archive.org/web/20110624131619/http://www.international.gc.ca/trade-agreements-accords-commerciaux/assets/pdfs/acta-crc_apr15-2011_eng.pdf
  15. Sonecon.com
  16. Economic Effects of Intellectual Property-Intensive Manufacturing in the United States, Robert Shapiro and Nam Pham, July 2007 (archived on archive.org).
  17. Measuring the Economic Impact of IP Systems, WIPO, 2007.
  18. Greenhalgh, C. & Rogers M., (2010). The Nature and Role of Intellectual Property. Innovation, Intellectual Property, and Economic Growth. New Jersey: Princeton University Press. (p. 32-34).
  19. United Nations. "The Universal Declaration of Human Rights". http://www.un.org/en/documents/udhr/index.shtml. Retrieved October 25, 2011. 
  20. WIPO - The World Intellectual Property Organization. "Human Rights and Intellectual Property: An Overview". http://www.wipo.int/tk/en/hr/. Retrieved October 25, 2011. 
  21. Ronald V. Bettig. "Critical Perspectives on the History and Philosophy of Copyright" in Copyrighting Culture: The Political Economy of Intellectual Property, by Ronald V. Bettig. (Boulder, CO: Westview Press, 1996), 19-20
  22. De George, Richard T. "14. Intellectual Property Rights". in George G. Brenkert. The Oxford Handbook of Business Ethics. 1, 1st ed. Oxford, England: Oxford University Press. pp. 408–439. 
  23. Spinello, Richard A. (January 2007). "Intellectual property rights". Library Hi Tech 25 (1): 12–22. doi:10.1108/07378130710735821. 
  24. Capitalism: The Unknown Ideal, Chapter 11 "Patents and Copyrights" - Ayn Rand
  25. Richard M. Stallman. "Did You Say "Intellectual Property"? It's a Seductive Mirage". Free Software Foundation, Inc. http://www.gnu.org/philosophy/not-ipr.xhtml. Retrieved 2008-03-28. 
  26. a b "Against perpetual copyright". http://wiki.lessig.org/index.php/Against_perpetual_copyright. 
  27. On patents - Daniel B. Ravicher (August 6, 2008). "Protecting Freedom In The Patent System: The Public Patent Foundation's Mission and Activities". http://www.youtube.com/watch?v=d0chez_Jf5A. 
  28. Joseph Stiglitz (October 13, 2006). "Authors@Google: Joseph Stiglitz - Making Globalization Work.". http://www.youtube.com/watch?v=UzhD7KVs-R4#t=16m05s. 
  29. WIPO - World Intellectual Property Organization. "Human Rights and Intellectual Property: An Overview". http://www.wipo.int/tk/en/hr/. Retrieved October 25, 2011. 
  30. Chapman, Audrey R. (December 2002). "The Human Rights Implications of Intellectual Property Protection". Journal of International Economic Law 5 (4): 861–882. doi:10.1093/jiel/5.4.861. 
  31. N. Stephan Kinsella, Against Intellectual property (2008), p. 44.
  32. E.g., the U.S. Copyright Term Extension Act, Pub.L. 105-298.
  33. Mark Helprin, Op-ed: A Great Idea Lives Forever. Shouldn't Its Copyright? The New York Times, May 20, 2007.
  34. Eldred v. Ashcroft Eldred v. Ashcroft, 537 U. S. 186 (2003)
  35. Mike Masnick (May 21, 2007). "Arguing For Infinite Copyright... Using Copied Ideas And A Near Total Misunderstanding Of Property". techdirt. http://www.techdirt.com/articles/20070521/015928.shtml. 
  36. Council for Responsible Genetics, DNA Patents Create Monopolies on Living Organisms. Accessed 2008.12.18.
  37. Plant Patents USPTO.gov
  38. For example, AstraZeneca holds a registered trademark to the color purple, as used in pill capsules. AstraZeneca, Nexium: Legal. Accessed 2008.12.18.
  39. Birgitte Andersen. 'Intellectual Property Right' or 'Intellectual Monopoly Privilege': Which One Should Patent Analysts Focus On? CONFERÊNCIA INTERNACIONAL SOBRE SISTEMAS DE INOVAÇÃO E ESTRATÉGIAS DE DESENVOLVIMENTO PARA O TERCEIRO MILÊNIO • NOV. 2003
  40. Martin G, Sorenson C and Faunce TA. Balancing intellectual monopoly privileges and the need for essential medicines. Globalization and Health 2007, 3:4. doi:10.1186/1744-8603-3-4. ("Balancing the need to protect the intellectual property rights (IPRs) (which the third author considers are more accurately described as intellectual monopoly privileges (IMPs)) of pharmaceutical companies, with the need to ensure access to essential medicines in developing countries is one of the most pressing challenges facing international policy makers today.")
  41. Thomas Jefferson, Letter to Isaac McPherson (August 13, 1813)
  42. Philip Bennet, "Native Americans and Intellectual Property: the Necessity of Implementing Collective Ideals into Current United States Intellectual Property Laws", 2009 [1]
  43. a b Sonderholm, Jorn (2010). Ethical Issues Surrounding Intellectual Property Rights. Philosophy Compass. pp. 1108-1109. 
  44. (PDF) Understanding Copyright and Related Rights. World Intellectual Property Organisation. 2016. p. 3. ISBN 9789280527995. https://www.wipo.int/edocs/pubdocs/en/wipo_pub_909_2016.pdf. Retrieved October 25, 2019. 
  45. Mincov, Andrei. "International Copyright Law Survey". Vancouver, BC, Canada: Trademark Factory International. Archived from the original on March 4, 2016. https://web.archive.org/web/20160304070035if_/http://worldcopyrightlaw.com/copyrightsurvey. Retrieved October 25, 2019. 
  46. Boyle, James T. (October 30, 1997) [First published 1996]. "Chapter 11: The International Political Economy of Authorship". Shamans, Software and Spleens: Law and the Construction of the Information Society (1st ed.). Cambridge, Mass.: Harvard University Press. p. 142. ISBN 9780674805231. OCLC 490635851. 
  47. Patterson, Lyman Ray (1968). "Chapter 6: Copyright and Censorship". Copyright in Historical Perspective. Nashville, Tenn.: Vanderbilt University Press. pp. 136-137. ISBN 9780826513731. OCLC 442447. 
  48. Bettig, Ronald V. (1996). Copyrighting Culture: The Political Economy of Intellectual Property. Westview Press. p. 9–17. ISBN 0813313856.
  49. Ronan, Deazley (2006). Rethinking copyright: history, theory, language. Edward Elgar Publishing.. pp. 13. ISBN 9781845422820. 
  50. a b "Berne Convention for the Protection of Literary and Artistic Works Article 5". World Intellectual Property Organization. http://www.wipo.int/treaties/en/ip/berne/trtdocs_wo001.html#P109_16834. Retrieved 2011-11-18. 
  51. Garfinkle, Ann M; Fries, Janet; Lopez, Daniel; Possessky, Laura (1997). "Art conservation and the legal obligation to preserve artistic intent". JAIC 36 (2): 165–179.
  52. "International Copyright Relations of the United States", U.S. Copyright Office Circular No. 38a, August 2003.
  53. Parties to the Geneva Act of the Universal Copyright Convention as of 2000-01-01: the dates given in the document are dates of ratification, not dates of coming into force. The Geneva Act came into force on 1955-09-16 for the first twelve to have ratified (which included four non-members of the Berne Union as required by Art. 9.1), or three months after ratification for other countries.
  54. Parties to the Berne Convention for the Protection of Literary and Artistic Works as of 2006-05-30.
  55. MacQueen, Hector L; Charlotte Waelde and Graeme T Laurie (2007). Contemporary Intellectual Property: Law and Policy. Oxford University Press. pp. 39. ISBN 9780199263394. http://www.google.com/books?id=_Iwcn4pT0OoC&dq=contemporary+intellectual+property&source=gbs_navlinks_s. 
  56. World Intellectual Property Organisation. "Understanding Copyright and Related Rights" (PDF). WIPO. pp. 8. http://www.wipo.int/freepublications/en/intproperty/909/wipo_pub_909.pdf. Retrieved August 2008. 
  57. Express Newspaper Plc v News (UK) Plc, F.S.R. 36 (1991)
  58. 17 U.S.C. § 201(b); Cmty. for Creative Non-Violence v. Reid, 490 U.S. 730 (1989)
  59. Copyright Act of 1976, 90 Stat. 2541, § 401(a) (October 19, 1976)
  60. The Berne Convention Implementation Act of 1988 (BCIA), 102 Stat. 2853, 2857. One of the changes introduced by the BCIA was to section 401, which governs copyright notices on published copies, specifying that notices "may be placed on" such copies; prior to the BCIA, the statute read that notices "shall be placed on all" such copies. An analogous change was made in section 402, dealing with copyright notices on phonorecords.
  61. U.S. Copyright Office - Information Circular
  62. Copyright in General: I’ve heard about a "poor man’s copyright." What is it?, U.S Copyright Office
  63. "Copyright Registers", United Kingdom Intellectual Property Office
  64. Yu, Peter K. (2007). Intellectual Property and Information Wealth: Copyright and related rights. Greenwood Publishing Group. pp. 346. ISBN 9780275988838. 
  65. a b Creative Commons Website http://creativecommons.org/ retrieved October 24, 2011.
  66. a b Rubin, R. E. (2010) 'Foundations of Library and Information Science: Third Edition', Neal-Schuman Publishers, Inc., New York, p. 341
  67. "US CODE: Title 17,107. Limitations on exclusive rights: Fair use". law.cornell.edu. 2009-05-20. http://www4.law.cornell.edu/uscode/17/107.html. 
  68. insert
  69. a b c d e WIPO Guide on the Licensing of Copyright and Related Rights. World Intellectual Property Organization. 2004. pp. 6-8,15-16. ISBN 9789280512717. http://www.google.com/books?id=LvRRvXBIi8MC&dq=copyright+transfer+and+licensing&as_brr=3&source=gbs_navlinks_s. 
  70. Copyright Law of the USA, Chapter 1 Section 121: http://www.copyright.gov/title17/92chap1.html#121
  71. Copyright (Visually Impaired Persons) Act 2002 (England): http://www.rnib.org.uk/xpedio/groups/public/documents/publicwebsite/public_cvipsact2002.hcsp
  72. The Duration of Copyright and Rights in Performances Regulations 1995, part II, Amendments of the UK Copyright, Designs and Patents Act 1988
  73. Nimmer, David (2003). Copyright: Sacred Text, Technology, and the DMCA. Kluwer Law International. p. 63. ISBN 978-9041188762. OCLC 50606064. http://books.google.com/books?id=RYfRCNxgPO4C. 
  74. "Copyright Term and the Public Domain in the United States 1 January 2008.", Cornell University.
  75. Art. 26
  76. Lawrence Lessig, Copyright's First Amendment, 48 UCLA L. Rev. 1057, 1065 (2001)
  77. "Stanford Center for Internet and Society". Web.archive.org. Archived from the original on 2006-10-27. http://web.archive.org/web/20061027134508/http://cyberlaw.stanford.edu/about/cases/emily_somma_v_gosh_peter_.shtml. Retrieved 2010-05-08. 
  78. a b Owen, Lynette (2001). "Piracy". Association of Learned and Professional Society Publishers 14 (1): 67-68. 
  79. Butler, S. Piracy Losses "Billboard" 199(36) Retrieved from http://search.proquest.com/docview/227212689?accountid=14771
  80. a b Staff (2007) Piracy Statistics around the World "Video Business" 27(28).
  81. http://www.ejpd.admin.ch/content/ejpd/de/home/dokumentation/mi/2011/2011-11-30.html



Copyright infringement is the unauthorized or prohibited use of works under copyright, infringing the copyright holder's exclusive rights, such as the right to reproduce or perform the copyrighted work, or to make derivative works.

Piracy

The practice of labeling the infringement of exclusive rights in creative works as "piracy" predates statutory copyright law. Prior to the Statute of Anne 1709, the Stationers' Company of London received a Royal Charter in 1557 giving the company a monopoly on publication and tasking it with enforcing the charter. Those who violated the charter were labeled pirates as early as 1603.[1] After the establishment of copyright law with the 1709 Statute of Anne in Britain, the term "piracy" was used to refer to the unauthorized manufacturing and selling of works in copyright.[2] Article 12 of the 1886 Berne Convention for the Protection of Literary and Artistic Works uses the term "piracy" in relation to copyright infringement, stating "Pirated works may be seized on importation into those countries of the Union where the original work enjoys legal protection."[3] Article 61 of the 1994 Agreement on Trade-Related Aspects of Intellectual Property Rights (TRIPs) requires criminal procedures and penalties in cases of "wilful trademark counterfeiting or copyright piracy on a commercial scale."[4] Piracy traditionally refers to acts intentionally committed for financial gain, though more recently copyright holders have described online copyright infringement, particularly in relation to peer-to-peer file sharing networks, as "piracy."[2]

Theft

Some movie DVDs include an unskippable anti-piracy film that equates copyright infringement with theft.

Copyright holders frequently refer to copyright infringement as "theft." In copyright law, infringement does not refer to actual theft, but to an instance where a person exercises one of the exclusive rights of the copyright holder without authorization.[5] Courts have distinguished between copyright infringement and theft, holding, for instance, in the United States Supreme Court case Dowling v. United States (1985) that bootleg phonorecords did not constitute stolen property and that "interference with copyright does not easily equate with theft, conversion, or fraud. The Copyright Act even employs a separate term of art to define one who misappropriates a copyright... 'an infringer of the copyright.'" In copyright infringement, the province guaranteed to the copyright holder by copyright law, that is, the exclusive rights, is invaded, but no control, physical or otherwise, is taken over the copyright, nor is the copyright holder wholly deprived of using the copyrighted work or exercising the exclusive rights held.[6]

Enforcement responsibility

The enforcement of copyright is the responsibility of the copyright holder.[7] Article 50 of the Agreement on Trade-Related Aspects of Intellectual Property Rights (TRIPs) requires that signatory countries enable courts to remedy copyright infringement with injunctions and the destruction of infringing products, and to award damages.[4] Through the Anti-Counterfeiting Trade Agreement (ACTA), copyright holders have demanded that states defend copyright holders' rights and enforce copyright law through active policing of copyright infringement.[8] They have also demanded that states provide criminal sanctions for all types of copyright infringement and pursue copyright infringement through administrative procedures, rather than through the judicial due process required by TRIPs.[7]

Criminal liability

Article 61 of TRIPs requires that signatory countries establish criminal procedures and penalties in cases of "willful trademark counterfeiting or copyright piracy on a commercial scale".[4] Copyright holders have demanded that states provide criminal sanctions for all types of copyright infringement.[7]

In India, Section 63 of the Copyright Act, 1957 provides that "Any person who knowingly infringes or abets the infringement of the copyright in a work shall be punishable with imprisonment which may extend to one year, or with fine, or with both."

Online intermediary liability

Whether internet intermediaries are liable for copyright infringement committed by their users without the intermediaries' authorisation has been the subject of debate and court cases in a number of countries.[9] The liability of online intermediaries was one of the earliest legal issues surrounding the internet. Early court cases focused on the liability of internet service providers (ISPs) for hosting, transmitting or publishing content that could be actioned under civil or criminal law, such as libel, defamation, or pornography.[10] Because different kinds of content were considered in different legal systems, and in the absence of common definitions for "ISPs," "bulletin boards" or "online publishers," early law on online intermediaries' liability varies widely from country to country. The first laws on online intermediaries' liability were passed from the mid-1990s onwards, and the debate has shifted away from questions about whether internet intermediaries are liable for particular kinds of content, such as libellous or copyright-infringing material, towards a debate on whether online intermediaries should generally be made responsible for content accessible through their services or infrastructure.[11]

The BitTorrent protocol: in the original animation, colored bars beneath seven clients represent the pieces of a file. After the initial pieces transfer from the seed (the large system at the bottom), the pieces are transferred individually from client to client. The original seeder needs to send out only one copy of the file for all the clients to receive a copy.

Internet intermediaries were formerly understood primarily in terms of ISPs. However, the term is now also understood to cover internet portals, software and games providers, providers of virtual information such as interactive forums and comment facilities (with or without a moderation system), aggregators, universities, libraries and archives, search engines, chat rooms, web blogs, mailing lists, and any website which provides access to third-party content through, for example, hyperlinks. Questions of liability have also emerged in relation to internet communications infrastructure intermediaries other than ISPs, including internet backbone providers, cable companies and mobile communications providers.[12]

The US Digital Millennium Copyright Act (1998) and the European E-Commerce Directive (2000) provide online intermediaries with safe harbor provisions, known in the Directive as the "mere conduit" principle. Online intermediaries who host content that infringes copyright are not liable, so long as they do not know about it and take action once the infringing content is brought to their attention. However, questions have arisen in relation to online intermediaries that are not hosts, particularly in the context of copyright infringement through peer-to-peer file sharing networks. Such intermediaries may be regarded as enabling or assisting in the downloading and uploading of files by users, and may include the developers of peer-to-peer software, the websites that allow users to download such software, and, in the case of the BitTorrent protocol, the torrent website and the torrent tracker. These intermediaries do not host or transmit the infringing files themselves, though they may be considered to "point to" the files. Since the late 1990s copyright holders have taken legal action against a number of peer-to-peer intermediaries, such as Napster, Grokster, eMule, SoulSeek and BitTorrent, and case law on the liability of internet service providers (ISPs) in relation to copyright infringement has emerged primarily from these cases.[13]

The decentralized structure of peer-to-peer networks does not sit easily with existing laws on online intermediaries' liability. The BitTorrent protocol established an entirely decentralized network architecture in order to distribute large files effectively, and recent developments in peer-to-peer technology towards more complex network configurations are said to have been driven by a desire to avoid liability as intermediaries under existing laws.[14] While ISPs and other organizations acting as online intermediaries, such as libraries, have been given protection under existing safe harbor provisions in relation to copyright infringement, peer-to-peer file sharing intermediaries have been denied access to those provisions. Legal action against peer-to-peer intermediaries, such as Napster, is generally brought under principles of secondary liability for copyright infringement, such as contributory liability and vicarious liability.[15]
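The legal significance of this decentralized architecture is easier to see with a toy model. The sketch below is hypothetical, not real BitTorrent code (the function name and counters are invented for illustration); it only shows why, once pieces circulate, the original seeder uploads each piece a single time while every other transfer happens directly between peers, with no central host handling the file.

```python
import random

def simulate_swarm(num_pieces=8, num_clients=7):
    """Toy model of BitTorrent-style piece exchange: a piece is fetched
    from the seed only if no other client holds it yet; otherwise it is
    fetched from a peer that already has it."""
    have = {c: set() for c in range(num_clients)}  # pieces held per client
    uploads = {"seed": 0, "peers": 0}              # who served each transfer

    # loop until every client holds every piece
    while any(len(pieces) < num_pieces for pieces in have.values()):
        client = random.choice(
            [c for c, pieces in have.items() if len(pieces) < num_pieces])
        piece = random.choice(sorted(set(range(num_pieces)) - have[client]))
        # prefer a peer that already holds the piece; fall back to the seed
        if any(piece in pieces for c, pieces in have.items() if c != client):
            uploads["peers"] += 1
        else:
            uploads["seed"] += 1
        have[client].add(piece)
    return uploads

counts = simulate_swarm()
print(counts)  # the seed serves each of the 8 pieces exactly once
```

Whatever order the random requests arrive in, the seed serves only the first request for each piece (8 transfers), while the remaining 48 transfers are peer-to-peer; this is the sense in which no single intermediary hosts or transmits the whole file.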

Countries where sharing files without profit is legal

Downloading copyrighted music is legal under copyright law in some countries, such as Canada,[16] the Netherlands,[17] Spain,[18] and Panama, provided that the songs are not sold. In Canada it is legal to download any copyrighted file as long as it is for noncommercial use, but it is illegal to distribute copyrighted files without authorization, for example by uploading them to a peer-to-peer network.[19]

Russian law

Downloading music and films for home use is legal in Russia due to an exception provided by Article 1273 of the Civil Code of the Russian Federation. A special 1% compensatory levy intended for copyright holders is collected on the price of certain goods (such as computers and blank CD-RW discs).[citation needed] The compensation mechanism is unclear, though, and is left entirely in the hands of the collecting agency established at the same time, with Nikita Mikhalkov, a prominent film director and political figure, at its helm.


References

  1. T. Dekker, The Wonderfull Yeare (1603), University of Oregon.
  2. a b Panethiere, Darrell (July – September 2005). "The Persistence of Piracy: The Consequences for Creativity, for Culture, and for Sustainable Development". UNESCO e-Copyright Bulletin. pp. 2. http://portal.unesco.org/culture/en/files/28696/11513329261panethiere_en.pdf/panethiere_en.pdf. 
  3. Panethiere, Darrell (July – September 2005). "The Persistence of Piracy: The Consequences for Creativity, for Culture, and for Sustainable Development". UNESCO e-Copyright Bulletin. pp. 14. http://portal.unesco.org/culture/en/files/28696/11513329261panethiere_en.pdf/panethiere_en.pdf. 
  4. a b c Correa, Carlos Maria; Li, Xuan (2009). Intellectual property enforcement: international perspectives. Edward Elgar Publishing. pp. 208. ISBN 9781848446632. http://books.google.com/books?id=bN3o1uwpKF4C&dq=copyright+infringement+international+acta&source=gbs_navlinks_s. 
  5. Clough, Jonathan (2010). Principles of Cybercrime. Cambridge University Press. pp. 221. ISBN 9780521728126. http://books.google.com/books?id=JVPnCqEuTksC&dq=copyright+infringement+theft&source=gbs_navlinks_s. 
  6. Dowling v. United States (1985), 473 U.S. 207, pp. 217–218.
  7. a b c Correa, Carlos Maria; Li, Xuan (2009). Intellectual property enforcement: international perspectives. Edward Elgar Publishing. pp. 211. ISBN 9781848446632. http://books.google.com/books?id=bN3o1uwpKF4C&dq=copyright+infringement+international+acta&source=gbs_navlinks_s. 
  8. "The Anti-Counterfeiting Trade Agreement – Summary of Key Elements Under Discussion" (pdf). transparency paper. Swiss federation of Intellectual Property. Status November 2009. https://www.ige.ch/fileadmin/user_upload/Juristische_Infos/e/transparency_paper.pdf. Retrieved 8 June 2010. 
  9. Edwards, Lilian; Waelde, Charlotte (2005). "Online Intermediaries and Liability for Copyright Infringement" (pdf). Keynote paper at WIPO Workshop on Online Intermediaries and Liability for Copyright, Geneva. World Intellectual Property Organisation (WIPO). p. 2. http://www.era.lib.ed.ac.uk/bitstream/1842/2305/1/wipo-onlineintermediaries.pdf. Retrieved September 2010. 
  10. Edwards, Lilian; Waelde, Charlotte (2005). "Online Intermediaries and Liability for Copyright Infringement" (pdf). Keynote paper at WIPO Workshop on Online Intermediaries and Liability for Copyright, Geneva. World Intellectual Property Organisation (WIPO). p. 4. http://www.era.lib.ed.ac.uk/bitstream/1842/2305/1/wipo-onlineintermediaries.pdf. Retrieved September 2010. 
  11. Edwards, Lilian; Waelde, Charlotte (2005). "Online Intermediaries and Liability for Copyright Infringement" (pdf). Keynote paper at WIPO Workshop on Online Intermediaries and Liability for Copyright, Geneva. World Intellectual Property Organisation (WIPO). p. 5. http://www.era.lib.ed.ac.uk/bitstream/1842/2305/1/wipo-onlineintermediaries.pdf. Retrieved September 2010. 
  12. Edwards, Lilian; Waelde, Charlotte (2005). "Online Intermediaries and Liability for Copyright Infringement" (pdf). Keynote paper at WIPO Workshop on Online Intermediaries and Liability for Copyright, Geneva. World Intellectual Property Organisation (WIPO). pp. 5–6. http://www.era.lib.ed.ac.uk/bitstream/1842/2305/1/wipo-onlineintermediaries.pdf. Retrieved September 2010. 
  13. Edwards, Lilian; Waelde, Charlotte (2005). "Online Intermediaries and Liability for Copyright Infringement" (pdf). Keynote paper at WIPO Workshop on Online Intermediaries and Liability for Copyright, Geneva. World Intellectual Property Organisation (WIPO). p. 7. http://www.era.lib.ed.ac.uk/bitstream/1842/2305/1/wipo-onlineintermediaries.pdf. Retrieved September 2010. 
  14. Edwards, Lilian; Waelde, Charlotte (2005). "Online Intermediaries and Liability for Copyright Infringement" (pdf). Keynote paper at WIPO Workshop on Online Intermediaries and Liability for Copyright, Geneva. World Intellectual Property Organisation (WIPO). p. 9. http://www.era.lib.ed.ac.uk/bitstream/1842/2305/1/wipo-onlineintermediaries.pdf. Retrieved September 2010. 
  15. Edwards, Lilian; Waelde, Charlotte (2005). "Online Intermediaries and Liability for Copyright Infringement" (pdf). Keynote paper at WIPO Workshop on Online Intermediaries and Liability for Copyright, Geneva. World Intellectual Property Organisation (WIPO). p. 10. http://www.era.lib.ed.ac.uk/bitstream/1842/2305/1/wipo-onlineintermediaries.pdf. Retrieved September 2010. 
  16. "Your Interview: Michael Geist". Canadian Broadcasting Corporation. 2008-04-07. Archived from the original on 2008-04-12. http://web.archive.org/web/20080412063751/http://www.cbc.ca/news/yourinterview/2008/04/michael_geist.html. 
  17. "In depth: Downloading music". Canadian Broadcasting Corporation. 2006-05-01. Archived from the original on 2003-12-17. http://web.archive.org/web/20031217014555/http://www.cbc.ca/news/background/internet/downloading_music.html. 
  18. "La red P2P es legal" (pdf). http://www.bufetalmeida.com/upload/file/sentenciaelrincondejesus.pdf. Retrieved May 2011. 
  19. "Canada deems P2P downloading legal". CNET News. 2003-12-12. Archived from the original on 2013-01-02. http://archive.is/W0mdI. Retrieved 2012-12-27. 

Further reading

  • Johns, Adrian: Piracy. The Intellectual Property Wars from Gutenberg to Gates. The University of Chicago Press, 2009, ISBN 978-0-226-40118-8
  • Rosen, Ronald (2008). Music and Copyright. Oxford Oxfordshire: Oxford University Press. ISBN 0195338367. 
  • Joe Karaganis, ed (2011). Media Piracy in Emerging Economies. Social Science Research Council. ISBN 978-0-9841257-4-6. 


An injunction is an equitable remedy in the form of a court order that requires a party to do or refrain from doing specific acts. A party that fails to comply with an injunction faces criminal or civil penalties and may have to pay damages or accept sanctions. In some cases, breaches of injunctions are considered serious criminal offenses that merit arrest and possible prison sentences.

The term interdict is used in Scots law.[1]

Rationale and reasons for injunctions

This injunctive power to restore the status quo ante, that is, to make whole again someone whose rights have been violated, is essential to the concept of fairness (equity). For example, money damages would be of scant benefit to a landowner who wished simply to prevent someone from repeatedly trespassing on his land.

These are some common reasons for injunctions:

  • stalking
  • domestic violence
  • harassment
  • discrimination
  • bullying (in some cases)
  • physical or sexual abuse
  • the wrongful transfer of real property, also called fraudulent conveyance
  • the disclosure of sensitive information in breach of the Official Secrets Act 1989 (UK only)
  • trademark infringement
  • copyright infringement
  • patent infringement
  • trade secret disclosure
  • tortious interference of contract
  • criminal contempt
  • civil contempt
  • unauthorized practice of law

Gag orders (many countries)

A gag order is an order by a court or government restricting information or comment from being made public.

American injunctions

Temporary restraining orders

A temporary restraining order (TRO) may be issued for a short term. A TRO usually lasts while a motion for a preliminary injunction is being decided; the court then decides whether to drop the order or to issue a preliminary injunction.

A TRO may be granted ex parte, without informing in advance the party to whom the TRO is directed. Usually, a party moves ex parte to prevent an adversary from having notice of one's intentions. The TRO is granted to prevent the adversary from acting to frustrate the purpose of the action, for example, by wasting or hiding assets (as often occurs in divorce) or disclosing a trade secret that had been the subject of a non-disclosure agreement.

To obtain a TRO, a plaintiff must prove four elements: (1) likelihood of success on the merits; (2) the extent to which the plaintiff is being irreparably harmed by the defendant's conduct; (3) the extent to which the defendant will suffer irreparable harm if the TRO issues; and (4) the public interest.[2]

Other kinds of restraining orders

Many states have injunction laws that are written specifically to stop domestic violence, stalking, sexual assault or harassment and these are commonly called restraining orders, orders of protection, abuse prevention orders, or protective orders.

Injunctions in US labor law context

After the United States government successfully used an injunction to outlaw the Pullman boycott in 1894 in In re Debs, employers found that they could obtain federal court injunctions to ban strikes and organizing activities of all kinds by trade unions. These injunctions were often extremely broad; one injunction issued by a federal court in the 1920s effectively barred the United Mine Workers of America from talking to workers who had signed "yellow dog contracts" with their employers.

Unable to limit what they called "government by injunction" in the courts, labor and its allies persuaded the United States Congress in 1932 to pass the Norris-LaGuardia Act, which imposed so many procedural and substantive limits on the federal courts' power to issue injunctions that it amounted to an effective prohibition on federal court injunctions in cases arising out of labor disputes. A number of states followed suit and enacted "Little Norris-LaGuardia Acts" that imposed similar limitations on state courts' powers. The courts have since recognized a limited exception to the Norris-LaGuardia Act's strict limitations in cases in which a party seeks injunctive relief to enforce the grievance arbitration provisions of a collective bargaining agreement.

Employment discrimination

Unlike most other cases, in which equitable relief is granted only rarely, injunctive relief is the preferred remedy in discrimination cases.[3] However, if there is evidence of a now-hostile relationship between the employer and employee, the court may order a reasonable amount of "front pay"[4] along with back pay (the wages and benefits the employee lost from termination up to the point that judgment is entered[5]).

Australian apprehended violence orders

A court may grant an apprehended violence order (AVO) to a person who fears violence, harassment, abuse, or stalking.[6] A court may issue an AVO if it believes, on the balance of probabilities, that a person has reasonable grounds to fear personal violence, harassing conduct, molestation, intimidation, or stalking. A defendant's non-compliance with the order may result in the imposition of a fine, imprisonment, or both.

UK superinjunctions

In England and Wales, injunctions whose existence and details may not be legally reported, in addition to facts or allegations which may not be disclosed, have been issued; they have been given the informal name of superinjunctions (or super-injunctions).[7][8]

An example was the superinjunction raised in September 2009 by Carter-Ruck solicitors on behalf of oil trader Trafigura, prohibiting the reporting of an internal Trafigura report into the 2006 Côte d'Ivoire toxic waste dump scandal. The existence of the superinjunction was revealed only when it was referred to in a parliamentary question that was subsequently circulated on the Internet (parliamentary privilege protects statements which would otherwise be held to be in contempt of court). Before it could be challenged in court, the injunction was varied to permit reporting of the question.[9] By long legal tradition, parliamentary proceedings may be reported without restriction,[10] though such reports are covered only by qualified privilege. Another example of the use of a superinjunction was in a libel case in which a plaintiff who claimed he was defamed by family members in a dispute over a multimillion-pound family trust obtained anonymity for himself and for his relatives.[11]

Roy Greenslade credits the editor of The Guardian, Alan Rusbridger, with coining the word "super-injunction" in an article about the Trafigura affair in September 2009.[12]

The term "hyper-injunction" has also been used to describe an injunction similar to a superinjunction but also including an order that the injunction must not be discussed with members of Parliament, journalists or lawyers. One known hyper-injunction was obtained at the High Court in 2006, preventing its subject from saying that paint used in water tanks on passenger ships can break down and release potentially toxic chemicals.[13][14] This example became public knowledge in Parliament under parliamentary privilege.[15]

By May 2011, Private Eye claimed to be aware of 53 super-injunctions and anonymised privacy injunctions,[16] though Lord Neuberger's report into the usage of super-injunctions revealed that only two super-injunctions had been granted since January 2010. Many media sources had wrongly described all gagging orders as super-injunctions.[17]


References

  1. Linklater, Magnus (23 May 2011). "Scots law is different, but it’s still a risk". London: The Times. http://www.thetimes.co.uk/tto/news/uk/scotland/article3028709.ece. Retrieved 23 May 2011. 
  2. Pappan Enters. v. Hardee's Food Sys., 143 F.3d 800, 803 (3d Cir. 1998).
  3. "Damages". http://www.workplacefairness.org/damages#4. Retrieved September 4, 2010. 
  4. "Damages". http://www.workplacefairness.org/damages#5. Retrieved September 4, 2010. 
  5. "Damages". http://www.workplacefairness.org/damages#2. Retrieved September 4, 2010. 
  6. "New South Wales - Apprehended Violence Orders". National Council of Single Mothers and Their Children. http://www.ncsmc.org.au/wsas/legal_system/avo_nsw.htm. Retrieved September 26, 2010. 
  7. Press Gazette, 14 October 2009, MPs slam 'super injunction' which gagged Guardian
  8. Robinson, James (2009-10-13). "How super-injunctions are used to gag investigative reporting". The Guardian (London). http://www.guardian.co.uk/uk/2009/oct/13/super-injunctions-guardian-carter-ruck. 
  9. http://www.publications.parliament.uk/pa/cm201011/cmhansrd/cm110317/halltext/110317h0001.htm
  10. The Guardian, 13 October 2009, Trafigura drops bid to gag Guardian over MP's question
  11. Leigh, David (29 March 2011). "Superinjunction scores legal first for nameless financier in libel action". London: guardian.co.uk. http://www.guardian.co.uk/law/2011/mar/29/superinjunction-financier-libel-legal-case. Retrieved 3 April 2011. 
  12. Greenslade, Roy (20 April 2011). "Law is badly in need of reform as celebrities hide secrets". Evening Standard. http://www.thisislondon.co.uk/markets/article-23943177-law-is-badly-in-need-of-reform-as-celebrities-hide-secrets.do. Retrieved 30 April 2011. 
  13. Swinford, Steven (21 March 2011). "'Hyper-injunction' stops you talking to MP". The Daily Telegraph (London). http://www.telegraph.co.uk/news/uknews/law-and-order/8394566/Hyper-injunction-stops-you-talking-to-MP.html. 
  14. "Now 'hyper-injunction' gagging order stops constituent speaking to his own MP". Daily Mail (London). 21 March 2011. http://www.dailymail.co.uk/news/article-1368395/Now-hyper-injunction-gagging-order-stops-constituent-speaking-MP.html. 
  15. Tim Dowling (21 March 2011). "Got secrets you want to keep? Get a hyper-injunction". The Guardian (London). http://www.guardian.co.uk/law/2011/mar/21/secrets-to-keep-hyper-injunction?INTCMP=SRCH. 
  16. "Number crunching". Private Eye (Pressdram Ltd) 1288: 5. 2011. 
  17. "Media concession made in injunction report". BBC News (BBC). 20 May 2011. http://www.bbc.co.uk/news/uk-politics-13465286. Retrieved 20 May 2011. 

The Domain Name System (DNS) is a hierarchical distributed naming system for computers, services, or any resource connected to the Internet or a private network. It associates various information with domain names assigned to each of the participating entities. Most importantly, it translates domain names meaningful to humans into the numerical identifiers associated with networking equipment for the purpose of locating and addressing these devices worldwide.

An often-used analogy to explain the Domain Name System is that it serves as the phone book for the Internet by translating human-friendly computer hostnames into IP addresses. For example, the domain name www.example.com translates to the addresses 192.0.32.10 (IPv4) and 2620:0:2d0:200::10 (IPv6).

The Domain Name System makes it possible to assign domain names to groups of Internet resources and users in a meaningful way, independent of each entity's physical location. Because of this, World Wide Web (WWW) hyperlinks and Internet contact information can remain consistent and constant even if the current Internet routing arrangements change or the participant uses a mobile device. Internet domain names are easier to remember than IP addresses such as 208.77.188.166 (IPv4) or 2001:db8:1f70::999:de8:7648:6e8 (IPv6). Users take advantage of this when they recite meaningful Uniform Resource Locators (URLs) and e-mail addresses without having to know how the computer actually locates them.

The Domain Name System distributes the responsibility of assigning domain names and mapping those names to IP addresses by designating authoritative name servers for each domain. Authoritative name servers are assigned to be responsible for their particular domains, and in turn can assign other authoritative name servers for their sub-domains. This mechanism has made the DNS distributed and fault tolerant and has helped avoid the need for a single central register to be continually consulted and updated.

In general, the Domain Name System also stores other types of information, such as the list of mail servers that accept email for a given Internet domain. By providing a worldwide, distributed keyword-based redirection service, the Domain Name System is an essential component of the functionality of the Internet.

Other identifiers such as RFID tags, UPCs, international characters in email addresses and host names, and a variety of other identifiers could all potentially use DNS.[1][2]

The Domain Name System also specifies the technical functionality of this database service. It defines the DNS protocol, a detailed specification of the data structures and communication exchanges used in DNS, as part of the Internet Protocol Suite.

Overview[edit]

The Internet maintains two principal namespaces, the domain name hierarchy[3] and the Internet Protocol (IP) address spaces.[4] The Domain Name System maintains the domain name hierarchy and provides translation services between it and the address spaces. Internet name servers and a communication protocol implement the Domain Name System.[5] A DNS name server is a server that stores the DNS records for a domain name, such as address (A) records, name server (NS) records, and mail exchanger (MX) records (see also list of DNS record types); a DNS name server responds with answers to queries against its database.

History[edit]

The practice of using a name as a simpler, more memorable abstraction of a host's numerical address on a network dates back to the ARPANET era. Before the DNS was invented in 1983, each computer on the network retrieved a file called HOSTS.TXT from a computer at SRI (now SRI International).[6][7] The HOSTS.TXT file mapped names to numerical addresses. A hosts file still exists on most modern operating systems by default and generally contains a mapping of the IP address 127.0.0.1 to "localhost". Many operating systems use name resolution logic that allows the administrator to configure selection priorities for available name resolution methods.

The rapid growth of the network made a centrally maintained, hand-crafted HOSTS.TXT file unsustainable; it became necessary to implement a more scalable system capable of automatically disseminating the requisite information.

At the request of Jon Postel, Paul Mockapetris invented the Domain Name System in 1983 and wrote the first implementation. The original specifications were published by the Internet Engineering Task Force in RFC 882 and RFC 883, which were superseded in November 1987 by RFC 1034[3] and RFC 1035.[5] Several additional Request for Comments have proposed various extensions to the core DNS protocols.

In 1984, four Berkeley students—Douglas Terry, Mark Painter, David Riggle, and Songnian Zhou—wrote the first Unix implementation, called The Berkeley Internet Name Domain (BIND) Server.[8] In 1985, Kevin Dunlap of DEC significantly re-wrote the DNS implementation. Mike Karels, Phil Almquist, and Paul Vixie have maintained BIND since then. BIND was ported to the Windows NT platform in the early 1990s.

BIND was widely distributed, especially on Unix systems, and is the dominant DNS software in use on the Internet.[9] With the heavy use and resulting scrutiny of its open-source code, as well as increasingly sophisticated attack methods, many security flaws were discovered in BIND[citation needed]. This contributed to the development of a number of alternative name server and resolver programs. BIND version 9 was written from scratch and now has a security record comparable to other modern DNS software.[citation needed]

Structure[edit]

Domain name space[edit]

The domain name space consists of a tree of domain names. Each node or leaf in the tree has zero or more resource records, which hold information associated with the domain name. The tree sub-divides into zones beginning at the root zone. A DNS zone may consist of only one domain, or may consist of many domains and sub-domains, depending on the administrative authority delegated to the manager.

The hierarchical Domain Name System, organized into zones, each served by a name server

Administrative responsibility over any zone may be divided by creating additional zones. Authority is said to be delegated for a portion of the old space, usually in the form of sub-domains, to another nameserver and administrative entity. The old zone ceases to be authoritative for the new zone.

Domain name syntax[edit]

The definitive descriptions of the rules for forming domain names appear in RFC 1035, RFC 1123, and RFC 2181. A domain name consists of one or more parts, technically called labels, that are conventionally concatenated, and delimited by dots, such as example.com.

  • The right-most label conveys the top-level domain; for example, the domain name www.example.com belongs to the top-level domain com.
  • The hierarchy of domains descends from right to left; each label to the left specifies a subdivision, or subdomain, of the domain to the right. For example, the label example specifies a subdomain of the com domain, and www is a subdomain of example.com. This tree of subdivisions may have up to 127 levels.
  • Each label may contain up to 63 characters. The full domain name may not exceed a total length of 253 characters in its external dotted-label specification.[10] In the internal binary representation of the DNS the maximum length requires 255 octets of storage.[3] In practice, some domain registries may have shorter limits.[citation needed]
  • DNS names may technically consist of any character representable in an octet. However, the allowed formulation of domain names in the DNS root zone, and most other subdomains, uses a preferred format and character set. The characters allowed in a label are a subset of the ASCII character set, comprising the letters a through z, A through Z, the digits 0 through 9, and the hyphen. This rule is known as the LDH rule (letters, digits, hyphen). Domain names are interpreted in a case-independent manner.[11] Labels may not start or end with a hyphen.[12]
  • A hostname is a domain name that has at least one IP address associated with it. For example, the domain names www.example.com and example.com are also hostnames, whereas the com domain is not.
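The label rules above can be sketched as a small validator; this is a simplified check under the LDH rule and the stated length limits, ignoring internationalized names:

```python
import re

# LDH rule: 1-63 letters, digits, and hyphens per label, with no leading
# or trailing hyphen (RFC 1035 / RFC 1123 / RFC 2181).
LDH_LABEL = re.compile(r"^(?!-)[A-Za-z0-9-]{1,63}(?<!-)$")

def is_valid_hostname(name: str) -> bool:
    """Validate a dotted domain name against the LDH rule and length limits."""
    if len(name) > 253:   # external dotted-label length limit
        return False
    labels = name.rstrip(".").split(".")
    return all(LDH_LABEL.match(label) for label in labels)

print(is_valid_hostname("www.example.com"))   # True
print(is_valid_hostname("-bad-.example"))     # False
```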

Internationalized domain names[edit]

The permitted character set of the DNS prevented the representation of names and words of many languages in their native alphabets or scripts. ICANN has approved the Internationalizing Domain Names in Applications (IDNA) system, which maps Unicode strings into the valid DNS character set using Punycode. In 2009 ICANN approved the installation of IDN country code top-level domains. In addition, many registries of existing top-level domains (TLDs) have adopted IDNA.
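Python's built-in idna codec (an IDNA 2003 implementation) illustrates the Punycode mapping; the name here is just an example:

```python
# The "idna" codec maps Unicode labels to the ASCII-compatible "xn--"
# Punycode form that the DNS actually stores, one label at a time.
name = "münchen.example"
ascii_form = name.encode("idna")
print(ascii_form)                  # b'xn--mnchen-3ya.example'
print(ascii_form.decode("idna"))   # münchen.example
```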

Name servers[edit]

The Domain Name System is maintained by a distributed database system, which uses the client-server model. The nodes of this database are the name servers. Each domain has at least one authoritative DNS server that publishes information about that domain and the name servers of any domains subordinate to it. The top of the hierarchy is served by the root nameservers, the servers to query when looking up (resolving) a TLD.

Authoritative name server[edit]

An authoritative name server is a name server that gives answers that have been configured by an original source, for example, the domain administrator or by dynamic DNS methods, in contrast to answers that were obtained via a regular DNS query to another name server. An authoritative-only name server only returns answers to queries about domain names that have been specifically configured by the administrator.

An authoritative name server can either be a master server or a slave server. A master server is a server that stores the original (master) copies of all zone records. A slave server uses an automatic updating mechanism of the DNS protocol in communication with its master to maintain an identical copy of the master records.

Every DNS zone must be assigned a set of authoritative name servers that are installed in NS records in the parent zone.

When domain names are registered with a domain name registrar, their installation at the domain registry of a top level domain requires the assignment of a primary name server and at least one secondary name server. The requirement of multiple name servers aims to keep the domain functional even if one name server becomes inaccessible or inoperable.[13] The designation of a primary name server is determined solely by the priority given to the domain name registrar. For this purpose, generally only the fully qualified domain name of the name server is required, unless the servers are contained in the registered domain, in which case the corresponding IP address is needed as well.

Primary name servers are often master name servers, while secondary name servers may be implemented as slave servers.

An authoritative server indicates its status of supplying definitive answers, deemed authoritative, by setting a software flag (a protocol structure bit), called the Authoritative Answer (AA) bit in its responses.[5] This flag is usually reproduced prominently in the output of DNS administration query tools (such as dig) to indicate that the responding name server is an authority for the domain name in question.[5]

Recursive and caching name server[edit]

In principle, authoritative name servers are sufficient for the operation of the Internet. However, with only authoritative name servers operating, every DNS query must start with recursive queries at the root zone of the Domain Name System and each user system must implement resolver software capable of recursive operation.

To improve efficiency, reduce DNS traffic across the Internet, and increase performance in end-user applications, the Domain Name System supports DNS cache servers which store DNS query results for a period of time determined in the configuration (time-to-live) of the domain name record in question. Typically, such caching DNS servers, also called DNS caches, also implement the recursive algorithm necessary to resolve a given name starting with the DNS root through to the authoritative name servers of the queried domain. With this function implemented in the name server, user applications gain efficiency in design and operation.

The combination of DNS caching and recursive functions in a name server is not mandatory; the functions can be implemented independently in servers for special purposes.

Internet service providers typically provide recursive and caching name servers for their customers. In addition, many home networking routers implement DNS caches and recursors to improve efficiency in the local network.

DNS resolvers[edit]

The client-side of the DNS is called a DNS resolver. It is responsible for initiating and sequencing the queries that ultimately lead to a full resolution (translation) of the resource sought, e.g., translation of a domain name into an IP address.

A DNS query may be either a non-recursive query or a recursive query:

  • A non-recursive query is one in which the DNS server provides a record for a domain for which it is authoritative itself, or it provides a partial result without querying other servers.
  • A recursive query is one for which the DNS server will fully answer the query (or give an error) by querying other name servers as needed. DNS servers are not required to support recursive queries.

The resolver, or another DNS server acting recursively on behalf of the resolver, negotiates use of recursive service using bits in the query headers.

Resolving usually entails iterating through several name servers to find the needed information. However, some resolvers function more simply by communicating only with a single name server. These simple resolvers (called "stub resolvers") rely on a recursive name server to perform the work of finding information for them.

Operation[edit]

Address resolution mechanism[edit]

Domain name resolvers determine the appropriate domain name servers responsible for the domain name in question by a sequence of queries starting with the right-most (top-level) domain label.

A DNS recursor consults three nameservers to resolve the address www.wikipedia.org.

The process entails:

  1. A network host is configured with an initial cache (so-called hints) of the known addresses of the root nameservers. Such a hint file is updated periodically by an administrator from a reliable source.
  2. A query to one of the root servers to find the server authoritative for the top-level domain.
  3. A query to the obtained TLD server for the address of a DNS server authoritative for the second-level domain.
  4. Repetition of the previous step to process each domain name label in sequence, until the final step which returns the IP address of the host sought.

The diagram illustrates this process for the host www.wikipedia.org.
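The steps above can be modeled as a toy iterative resolver over an invented delegation table; real resolvers parse referrals from wire-format responses, but the control flow is the same:

```python
# A toy model of iterative resolution. Each "server" maps names either to
# a final address or to a delegation to another server. All names and
# addresses here are invented for illustration.
ROOT = {"org.": ("delegation", "tld_org")}
SERVERS = {
    "tld_org": {"wikipedia.org.": ("delegation", "ns_wikipedia")},
    "ns_wikipedia": {"www.wikipedia.org.": ("address", "208.80.152.2")},
}

def resolve(name: str) -> str:
    """Walk the delegation chain from the root until an address is found."""
    server = ROOT
    while True:
        for entry, (kind, value) in server.items():
            if name.endswith(entry):
                if kind == "address":
                    return value
                server = SERVERS[value]   # follow the referral downward
                break
        else:
            raise LookupError(name)

print(resolve("www.wikipedia.org."))   # 208.80.152.2
```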

The mechanism in this simple form would place a large operating burden on the root servers, since every resolution would begin by querying one of them; given how critical the root servers are to the overall function of the system, such heavy use would create an insurmountable bottleneck for the trillions of queries placed every day. In practice, caching in DNS servers overcomes this problem, and as a result root nameservers are involved in only a small fraction of all lookups.

Circular dependencies and glue records[edit]

Name servers in delegations are identified by name, rather than by IP address. This means that a resolving name server must issue another DNS request to find out the IP address of the server to which it has been referred. If the name given in the delegation is a subdomain of the domain for which the delegation is being provided, there is a circular dependency. In this case the nameserver providing the delegation must also provide one or more IP addresses for the authoritative nameserver mentioned in the delegation. This information is called glue. The delegating name server provides this glue in the form of records in the additional section of the DNS response, and provides the delegation in the answer section of the response.

For example, if the authoritative name server for example.org is ns1.example.org, a computer trying to resolve www.example.org first resolves ns1.example.org. Since ns1 is contained in example.org, this requires resolving example.org first, which presents a circular dependency. To break the dependency, the nameserver for the org top level domain includes glue along with the delegation for example.org. The glue records are address records that provide IP addresses for ns1.example.org. The resolver uses one or more of these IP addresses to query one of the domain's authoritative servers, which allows it to complete the DNS query.

Record caching[edit]

Because of the large volume of DNS requests generated for the public Internet, the designers wished to provide a mechanism to reduce the load on individual DNS servers. To this end, the DNS resolution process allows for caching of records for a period of time after an answer. This entails the local recording and subsequent consultation of the copy instead of initiating a new request upstream. The time for which a resolver caches a DNS response is determined by a value called the time to live (TTL) associated with every record. The TTL is set by the administrator of the DNS server handing out the authoritative response. The period of validity may vary from just seconds to days or even weeks.

As a noteworthy consequence of this distributed and caching architecture, changes to DNS records do not propagate throughout the network immediately, but require all caches to expire and refresh after the TTL. RFC 1912 conveys basic rules for determining appropriate TTL values.

Some resolvers may override TTL values, as the protocol supports caching for up to 68 years or no caching at all. Negative caching, i.e. the caching of the non-existence of a record, is determined by the name servers authoritative for a zone, which must include the Start of Authority (SOA) record when reporting that no data of the requested type exists. The lesser of the MINIMUM field of the SOA record and the TTL of the SOA itself is used to establish the TTL for the negative answer.
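A minimal sketch of such a cache, with invented names and TTLs:

```python
import time

class DnsCache:
    """Minimal sketch of a TTL-bound cache keyed by (name, record type)."""

    def __init__(self):
        self._store = {}

    def put(self, key, value, ttl):
        # Remember the value together with its absolute expiry time.
        self._store[key] = (value, time.monotonic() + ttl)

    def get(self, key):
        value, expiry = self._store.get(key, (None, 0.0))
        if time.monotonic() < expiry:
            return value
        self._store.pop(key, None)   # expired: a fresh upstream query is needed
        return None

cache = DnsCache()
cache.put(("example.com", "A"), "192.0.32.10", ttl=300)
print(cache.get(("example.com", "A")))   # 192.0.32.10
```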

Reverse lookup[edit]

A reverse lookup is a query of the DNS for domain names when the IP address is known. Multiple domain names may be associated with an IP address. The DNS stores IP addresses in the form of domain names as specially formatted names in pointer (PTR) records within the infrastructure top-level domain arpa. For IPv4, the domain is in-addr.arpa. For IPv6, the reverse lookup domain is ip6.arpa. The IP address is represented as a name in reverse-ordered octet representation for IPv4, and reverse-ordered nibble representation for IPv6.

When performing a reverse lookup, the DNS client converts the address into these formats and then queries for a PTR record, following the delegation chain as for any DNS query. For example, the IPv4 address 208.80.152.2 is represented as the DNS name 2.152.80.208.in-addr.arpa. The DNS resolver begins by querying the root servers, which point to ARIN's servers for the 208.in-addr.arpa zone. From there the Wikimedia servers are assigned for 152.80.208.in-addr.arpa, and the PTR lookup completes by querying the Wikimedia nameserver for 2.152.80.208.in-addr.arpa, which results in an authoritative response.
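The reverse-lookup names described above can be produced directly with the standard library's ipaddress module:

```python
import ipaddress

# reverse_pointer builds the PTR query name for either address family:
# reversed octets under in-addr.arpa for IPv4, reversed nibbles under
# ip6.arpa for IPv6.
print(ipaddress.ip_address("208.80.152.2").reverse_pointer)
# 2.152.80.208.in-addr.arpa
print(ipaddress.ip_address("2620:0:2d0:200::10").reverse_pointer)
```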

Client lookup[edit]

DNS resolution sequence

Users generally do not communicate directly with a DNS resolver. Instead DNS resolution takes place transparently in application programs such as web browsers, e-mail clients, and other Internet applications. When an application makes a request that requires a domain name lookup, such programs send a resolution request to the DNS resolver in the local operating system, which in turn handles the communications required.

The DNS resolver will almost invariably have a cache (see above) containing recent lookups. If the cache can provide the answer to the request, the resolver will return the value in the cache to the program that made the request. If the cache does not contain the answer, the resolver will send the request to one or more designated DNS servers. For most home users, the Internet service provider to which the machine connects supplies this DNS server: such a user will either have configured that server's address manually or allowed DHCP to set it. Where systems administrators have configured systems to use their own DNS servers, the DNS resolvers point to the organization's separately maintained nameservers. In any event, the name server thus queried follows the process outlined above until it either finds a result or fails. It then returns its results to the DNS resolver; assuming it has found a result, the resolver duly caches that result for future use and hands the result back to the software which initiated the request.

Broken resolvers[edit]

An additional level of complexity emerges when resolvers violate the rules of the DNS protocol. A number of large ISPs have configured their DNS servers to violate rules (presumably to allow them to run on less-expensive hardware than a fully compliant resolver), such as by disobeying TTLs, or by indicating that a domain name does not exist just because one of its name servers does not respond.[14]

As a final level of complexity, some applications (such as web-browsers) also have their own DNS cache, in order to reduce the use of the DNS resolver library itself. This practice can add extra difficulty when debugging DNS issues, as it obscures the freshness of data, and/or what data comes from which cache. These caches typically use very short caching times—on the order of one minute.[citation needed]

Internet Explorer represents a notable exception: versions up to IE 3.x cache DNS records for 24 hours by default. Internet Explorer 4.x and later versions (up to IE 8) decreased the default timeout to half an hour, which may be changed in the corresponding registry keys.[15]

Other applications[edit]

The system outlined above provides a somewhat simplified scenario. The Domain Name System includes several other functions:

  • Hostnames and IP addresses do not necessarily match on a one-to-one basis. Multiple hostnames may correspond to a single IP address: combined with virtual hosting, this allows a single machine to serve many web sites. Alternatively a single hostname may correspond to many IP addresses: this can facilitate fault tolerance and load distribution, and also allows a site to move physical location seamlessly.
  • There are many uses of DNS besides translating names to IP addresses. For instance, Mail transfer agents use DNS to find out where to deliver e-mail for a particular address. The domain to mail exchanger mapping provided by MX records accommodates another layer of fault tolerance and load distribution on top of the name to IP address mapping.
  • E-mail Blacklists: The DNS system is used for efficient storage and distribution of the IP addresses of blacklisted e-mail hosts. The usual method is to encode the IP address of the subject host as a sub-domain of a higher-level domain name, and to resolve that name to different records indicating a positive or a negative result. Here is a hypothetical example blacklist:
    • 102.3.4.5 is blacklisted => Creates 5.4.3.102.blacklist.example and resolves to 127.0.0.1
    • 102.3.4.6 is not => 6.4.3.102.blacklist.example is not found, or default to 127.0.0.2
    • E-mail servers can then query blacklist.example through the DNS mechanism to find out whether a specific host connecting to them is on the blacklist. Today many such blacklists, either free or subscription-based, are available, mainly for use by e-mail administrators and anti-spam software.
  • Software Updates: many anti-virus and commercial software packages now use the DNS system to store the version numbers of the latest software updates, so that client computers do not need to connect to the update servers every time. For these types of applications, the cache time of the DNS records is usually shorter.
  • Sender Policy Framework and DomainKeys, instead of creating their own record types, were designed to take advantage of another DNS record type, the TXT record.
  • To provide resilience in the event of computer failure, multiple DNS servers are usually provided for coverage of each domain, and at the top level, thirteen very powerful root servers exist, with additional "copies" of several of them distributed worldwide via Anycast.
  • Dynamic DNS (sometimes called DDNS) allows clients to update their DNS entry as their IP address changes, as it does, for example, when moving between ISPs or mobile hot spots.
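The e-mail blacklist lookup described above amounts to a simple name construction; blacklist.example is the hypothetical zone from the example:

```python
def dnsbl_query_name(ip: str, zone: str = "blacklist.example") -> str:
    """Build a DNSBL lookup name: octets reversed, under the list's zone.
    blacklist.example is the hypothetical zone from the text above."""
    return ".".join(reversed(ip.split("."))) + "." + zone

print(dnsbl_query_name("102.3.4.5"))   # 5.4.3.102.blacklist.example
```

A mail server would then resolve this name; an answer of 127.0.0.1 would mark the host as listed, while a name-not-found result would mean it is not.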

Protocol details[edit]

DNS primarily uses User Datagram Protocol (UDP) on port number 53 to serve requests.[5] DNS queries consist of a single UDP request from the client followed by a single UDP reply from the server. The Transmission Control Protocol (TCP) is used when the response data size exceeds 512 bytes, or for tasks such as zone transfers. Some resolver implementations use TCP for all queries.
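A minimal sketch of such a query message, assuming the standard RFC 1035 layout; the transaction ID and query name are arbitrary, and the packet is only built, not sent:

```python
import struct

def build_query(name: str, qtype: int = 1, txid: int = 0x1234) -> bytes:
    """Pack a minimal DNS query per RFC 1035: a 12-byte header, the QNAME
    as length-prefixed labels, then QTYPE and QCLASS (IN = 1)."""
    # Header: ID, flags (RD bit set), QDCOUNT=1, ANCOUNT/NSCOUNT/ARCOUNT=0
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"                      # zero-length root label ends the name
    return header + qname + struct.pack("!HH", qtype, 1)

packet = build_query("www.example.com")
print(len(packet))   # 33: 12-byte header + 17-byte QNAME + 4-byte question tail
```

In practice this payload would be sent in a single UDP datagram to port 53 and the reply parsed from the response datagram.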

DNS resource records[edit]

A Resource Record (RR) is the basic data element in the domain name system. Each record has a type (A, MX, etc.), an expiration time limit, a class, and some type-specific data. Resource records of the same type define a resource record set (RRset). The order of resource records in a set, returned by a resolver to an application, is undefined, but often servers implement round-robin ordering to achieve load balancing. DNSSEC, however, works on complete resource record sets in a canonical order.

When sent over an IP network, all records use the common format specified in RFC 1035:[16]

RR (Resource record) fields

Field     Description                                                              Length (octets)
NAME      Name of the node to which this record pertains                           (variable)
TYPE      Type of RR in numeric form (e.g. 15 for MX RRs)                          2
CLASS     Class code                                                               2
TTL       Count of seconds that the RR stays valid (maximum 2^31-1, about 68 yrs)  4
RDLENGTH  Length of the RDATA field                                                2
RDATA     Additional RR-specific data                                              (variable)

NAME is the fully qualified domain name of the node in the tree. On the wire, the name may be shortened using label compression where ends of domain names mentioned earlier in the packet can be substituted for the end of the current domain name.

TYPE is the record type. It indicates the format of the data and it gives a hint of its intended use. For example, the A record is used to translate from a domain name to an IPv4 address, the NS record lists which name servers can answer lookups on a DNS zone, and the MX record specifies the mail server used to handle mail for a domain specified in an e-mail address (see also List of DNS record types).

RDATA is data of type-specific relevance, such as the IP address for address records, or the priority and hostname for MX records. Well known record types may use label compression in the RDATA field, but "unknown" record types must not (RFC 3597).

The CLASS of a record is set to IN (for Internet) for common DNS records involving Internet hostnames, servers, or IP addresses. In addition, the classes Chaos (CH) and Hesiod (HS) exist.[17] Each class is an independent name space with potentially different delegations of DNS zones.
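As a sketch of this wire format, the following packs a hypothetical A record whose NAME field is a compression pointer back to offset 12 (the usual position of the query name just after the header); the TTL and address are invented:

```python
import socket
import struct

def pack_a_record(name_offset: int, ttl: int, address: str) -> bytes:
    """Pack an A resource record in RFC 1035 wire format. The NAME field is
    written as a two-octet compression pointer (top two bits set) back to
    name_offset rather than as a full label sequence."""
    pointer = struct.pack("!H", 0xC000 | name_offset)
    rdata = socket.inet_aton(address)               # 4-octet IPv4 RDATA
    # TYPE=1 (A), CLASS=1 (IN), 32-bit TTL, then RDLENGTH
    return pointer + struct.pack("!HHIH", 1, 1, ttl, len(rdata)) + rdata

rr = pack_a_record(name_offset=12, ttl=300, address="192.0.32.10")
print(len(rr))   # 16 octets: 2 (pointer) + 10 (fixed fields) + 4 (RDATA)
```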

In addition to resource records defined in a zone file, the domain name system also defines several request types that are used only in communication with other DNS nodes (on the wire), such as when performing zone transfers (AXFR/IXFR) or for EDNS (OPT).

Wildcard DNS records[edit]

The domain name system supports wildcard domain names which are names that start with the asterisk label, '*', e.g., *.example.[3][18] DNS records belonging to wildcard domain names specify rules for generating resource records within a single DNS zone by substituting whole labels with matching components of the query name, including any specified descendants. For example, in the DNS zone x.example, the following configuration specifies that all subdomains (including subdomains of subdomains) of x.example use the mail exchanger a.x.example. The records for a.x.example are needed to specify the mail exchanger. As this has the result of excluding this domain name and its subdomains from the wildcard matches, all subdomains of a.x.example must be defined in a separate wildcard statement.
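The configuration the paragraph refers to is not reproduced in this print version; a zone-file sketch consistent with the description (the TTLs and address are illustrative) would be:

```
*.x.example.   3600  IN  MX  10  a.x.example.
a.x.example.   3600  IN  MX  10  a.x.example.
a.x.example.   3600  IN  A   192.0.2.1
```

The explicit records for a.x.example are required because, once defined, that name and its subdomains no longer match the wildcard.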

The role of wildcard records was refined in RFC 4592, because the original definition in RFC 1034 was incomplete and resulted in misinterpretations by implementers.[18]

Protocol extensions[edit]

The original DNS protocol had limited provisions for extension with new features. In 1999, Paul Vixie published in RFC 2671 an extension mechanism, called Extension mechanisms for DNS (EDNS) that introduced optional protocol elements without increasing overhead when not in use. This was accomplished through the OPT pseudo-resource record that only exists in wire transmissions of the protocol, but not in any zone files. Initial extensions were also suggested (EDNS0), such as increasing the DNS message size in UDP datagrams.

Dynamic zone updates[edit]

Dynamic DNS updates use the UPDATE DNS opcode to add or remove resource records dynamically from a zone database maintained on an authoritative DNS server. The feature is described in RFC 2136. This facility is useful to register network clients into the DNS when they boot or become otherwise available on the network. Since a booting client may be assigned a different IP address each time from a DHCP server, it is not possible to provide static DNS assignments for such clients.

Security issues[edit]

Originally, security concerns were not major design considerations for DNS software or any software for deployment on the early Internet, as the network was not open for participation by the general public. However, the expansion of the Internet into the commercial sector in the 1990s changed the requirements for security measures to protect data integrity and user authentication.

Several vulnerability issues were discovered and exploited by malicious users. One such issue is DNS cache poisoning, in which data is distributed to caching resolvers under the pretense of being an authoritative origin server, thereby polluting the data store with potentially false information and long expiration times (time-to-live). Subsequently, legitimate application requests may be redirected to network hosts operated with malicious intent.

DNS responses are traditionally not cryptographically signed, leading to many attack possibilities; the Domain Name System Security Extensions (DNSSEC) modify DNS to add support for cryptographically signed responses. Several extensions have been devised to secure zone transfers as well.

Some domain names may be used to achieve spoofing effects. For example, paypal.com and paypa1.com are different names, yet users may be unable to distinguish them in a graphical user interface depending on the user's chosen typeface. In many fonts the letter l and the numeral 1 look very similar or even identical. This problem is acute in systems that support internationalized domain names, since many character codes in ISO 10646 may appear identical on typical computer screens. This vulnerability is occasionally exploited in phishing.[19]
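
The homograph problem can be demonstrated with Python's standard-library IDNA codec; the Cyrillic letter below is chosen purely as an illustration:

```python
# "paypal.com" versus a homograph in which Latin "a" (U+0061) is replaced
# by Cyrillic "a" (U+0430); many fonts render the two identically.
ascii_name = "paypal.com"
homograph = "p\u0430ypal.com"

print(ascii_name == homograph)      # False: the code points differ
print(ascii_name.encode("idna"))    # pure ASCII passes through unchanged
print(homograph.encode("idna"))     # encoded as an "xn--" (Punycode) label
```

The "xn--" prefix in the encoded form is what a resolver actually sees, even though the two names may be indistinguishable on screen.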

Techniques such as forward-confirmed reverse DNS can also be used to help validate DNS results.
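
The core check of forward-confirmed reverse DNS can be sketched as follows; lookup results are supplied as plain data here, standing in for real PTR and A queries (e.g. socket.gethostbyaddr and socket.gethostbyname_ex), and the names and RFC 5737 documentation addresses are illustrative:

```python
def fcrdns_ok(ip, ptr_names, forward_lookup):
    """Forward-confirmed reverse DNS: at least one PTR name for `ip`
    must resolve back to `ip`.  `forward_lookup` maps name -> list of
    addresses, standing in for a real forward (A record) query."""
    return any(ip in forward_lookup.get(name, []) for name in ptr_names)

# Illustrative lookup data:
lookups = {
    "mail.example.org": ["192.0.2.25"],
    "other.example.net": ["198.51.100.7"],
}
print(fcrdns_ok("192.0.2.25", ["mail.example.org"], lookups))   # True
print(fcrdns_ok("192.0.2.25", ["other.example.net"], lookups))  # False
```

A host passes only when the reverse and forward mappings agree, which makes casual spoofing of PTR records ineffective.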

Domain name registration[edit]

The right to use a domain name is delegated by domain name registrars, which are accredited by the Internet Corporation for Assigned Names and Numbers (ICANN), the organization charged with overseeing the name and number systems of the Internet. In addition to ICANN, each top-level domain (TLD) is maintained and serviced technically by an administrative organization operating a registry. A registry is responsible for maintaining the database of names registered within the TLD it administers. The registry receives registration information from each domain name registrar authorized to assign names in the corresponding TLD and publishes the information using a special service, the WHOIS protocol.

ICANN publishes the complete list of TLD registries and domain name registrars. Registrant information associated with domain names is maintained in an online database accessible with the WHOIS service. For most of the more than 240 country code top-level domains (ccTLDs), the domain registries maintain the WHOIS information (registrant, name servers, expiration dates, etc.). For instance, DENIC, the German NIC, holds the DE domain data. Since about 2001, most gTLD registries have adopted this so-called thick registry approach, i.e. keeping the WHOIS data in central registries instead of registrar databases.

For COM and NET domain names, a thin registry model is used: the domain registry (e.g. VeriSign) holds basic WHOIS (registrar and name servers, etc.) data. One can find the detailed WHOIS (registrant, name servers, expiry dates, etc.) at the registrars.
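
The WHOIS protocol itself (RFC 3912) is simple: open a TCP connection to port 43 on a registry's or registrar's WHOIS server, send the query terminated by CRLF, and read until the server closes the connection. A minimal Python sketch (the VeriSign server name is the well-known one for COM/NET, but treat it as an assumption):

```python
import socket

def build_whois_query(domain: str) -> bytes:
    """A WHOIS query is just the name followed by CRLF (RFC 3912)."""
    return (domain + "\r\n").encode("ascii")

def whois(domain: str, server: str = "whois.verisign-grs.com") -> str:
    """Send the query over TCP port 43 and read until the server closes."""
    with socket.create_connection((server, 43), timeout=10) as sock:
        sock.sendall(build_whois_query(domain))
        chunks = []
        while data := sock.recv(4096):
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")

# whois("example.com")  # performs a live network query; uncomment to try
```

Under the thin model described above, the registry's response includes a referral to the sponsoring registrar's own WHOIS server, which must then be queried for the registrant details.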

Some domain name registries, often called network information centers (NIC), also function as registrars to end-users. The major generic top-level domain registries, such as for the COM, NET, ORG, INFO domains, use a registry-registrar model consisting of many domain name registrars.[20][21] In this method of management, the registry only manages the domain name database and the relationship with the registrars. The registrants (users of a domain name) are customers of the registrar, in some cases through additional layers of resellers.

Internet standards[edit]

The Domain Name System is defined by Request for Comments (RFC) documents published by the Internet Engineering Task Force (Internet standards). The following is a list of RFCs that define the DNS protocol.

  • RFC 920, Domain Requirements – Specified original top-level domains
  • RFC 1032, Domain Administrators Guide
  • RFC 1033, Domain Administrators Operations Guide
  • RFC 1034, Domain Names - Concepts and Facilities
  • RFC 1035, Domain Names - Implementation and Specification
  • RFC 1101, DNS Encodings of Network Names and Other Types
  • RFC 1123, Requirements for Internet Hosts—Application and Support
  • RFC 1178, Choosing a Name for Your Computer (FYI 5)
  • RFC 1183, New DNS RR Definitions
  • RFC 1591, Domain Name System Structure and Delegation (Informational)
  • RFC 1912, Common DNS Operational and Configuration Errors
  • RFC 1995, Incremental Zone Transfer in DNS
  • RFC 1996, A Mechanism for Prompt Notification of Zone Changes (DNS NOTIFY)
  • RFC 2100, The Naming of Hosts (Informational)
  • RFC 2136, Dynamic Updates in the domain name system (DNS UPDATE)
  • RFC 2181, Clarifications to the DNS Specification
  • RFC 2182, Selection and Operation of Secondary DNS Servers
  • RFC 2308, Negative Caching of DNS Queries (DNS NCACHE)
  • RFC 2317, Classless IN-ADDR.ARPA delegation (BCP 20)
  • RFC 2671, Extension Mechanisms for DNS (EDNS0)
  • RFC 2672, Non-Terminal DNS Name Redirection
  • RFC 2845, Secret Key Transaction Authentication for DNS (TSIG)
  • RFC 3225, Indicating Resolver Support of DNSSEC
  • RFC 3226, DNSSEC and IPv6 A6 aware server/resolver message size requirements
  • RFC 3597, Handling of Unknown DNS Resource Record (RR) Types
  • RFC 3696, Application Techniques for Checking and Transformation of Names (Informational)
  • RFC 4343, Domain Name System (DNS) Case Insensitivity Clarification
  • RFC 4592, The Role of Wildcards in the Domain Name System
  • RFC 4635, HMAC SHA TSIG Algorithm Identifiers
  • RFC 4892, Requirements for a Mechanism Identifying a Name Server Instance (Informational)
  • RFC 5001, DNS Name Server Identifier (NSID) Option
  • RFC 5452, Measures for Making DNS More Resilient against Forged Answers
  • RFC 5625, DNS Proxy Implementation Guidelines (BCP 152)
  • RFC 5890, Internationalized Domain Names for Applications (IDNA):Definitions and Document Framework
  • RFC 5891, Internationalized Domain Names in Applications (IDNA): Protocol
  • RFC 5892, The Unicode Code Points and Internationalized Domain Names for Applications (IDNA)
  • RFC 5893, Right-to-Left Scripts for Internationalized Domain Names for Applications (IDNA)
  • RFC 5894, Internationalized Domain Names for Applications (IDNA):Background, Explanation, and Rationale (Informational)
  • RFC 5895, Mapping Characters for Internationalized Domain Names in Applications (IDNA) 2008 (Informational)
  • RFC 6195, Domain Name System (DNS) IANA Considerations (BCP 42)

Security[edit]

  • RFC 4033, DNS Security Introduction and Requirements
  • RFC 4034, Resource Records for the DNS Security Extensions
  • RFC 4035, Protocol Modifications for the DNS Security Extensions
  • RFC 4509, Use of SHA-256 in DNSSEC Delegation Signer (DS) Resource Records
  • RFC 4470, Minimally Covering NSEC Records and DNSSEC On-line Signing
  • RFC 5011, Automated Updates of DNS Security (DNSSEC) Trust Anchors
  • RFC 5155, DNS Security (DNSSEC) Hashed Authenticated Denial of Existence
  • RFC 5702, Use of SHA-2 Algorithms with RSA in DNSKEY and RRSIG Resource Records for DNSSEC
  • RFC 5910, Domain Name System (DNS) Security Extensions Mapping for the Extensible Provisioning Protocol (EPP)
  • RFC 5933, Use of GOST Signature Algorithms in DNSKEY and RRSIG Resource Records for DNSSEC

References[edit]

  1. Mockapetris, Paul (2004-01-02). "Letting DNS Loose". CircleID. http://www.circleid.com/posts/letting_dns_loose/. "RFID tags, UPC codes, International characters in email addresses and host names, and a variety of other identifiers could all go into DNS [...] — it's ready to carry arbitrary identifiers." 
  2. Mockapetris, Paul (April 1989). "RFC 1101: DNS Encoding of Network Names and Other Types". p. 1. "The DNS is extensible and can be used for a virtually unlimited number of data types, name spaces, etc." 
  3. a b c d RFC 1034, Domain Names - Concepts and Facilities, P. Mockapetris, The Internet Society (November 1987)
  4. RFC 791, Internet Protocol - DARPA Internet Program Protocol Specification, Information Sciences Institute, J. Postel (Ed.), The Internet Society (September 1981)
  5. a b c d e RFC 1035, Domain Names - Implementation and Specification, P. Mockapetris, The Internet Society (November 1987)
  6. RFC 3467, Role of the Domain Name System (DNS), J. Klensin (February 2003)
  7. Cricket Liu, Paul Albitz (2006). DNS and BIND (5th ed.). O'Reilly. p. 3. http://oreilly.com/catalog/9780596100575. 
  8. Douglas Brian Terry, Mark Painter, David W. Riggle and Songnian Zhou, The Berkeley Internet Name Domain Server, Proceedings USENIX Summer Conference, Salt Lake City, Utah, June 1984, pages 23–31.
  9. "DNS Server Survey". http://mydns.bboy.net/survey/. 
  10. RFC 2181, Clarifications to the DNS Specification, R. Elz, R. Bush (July 1997)
  11. Network Working Group of the IETF, January 2006, RFC 4343: Domain Name System (DNS) Case Insensitivity Clarification
  12. RFC 3696, Application Techniques for Checking and Transformation of Names, J. Klensin (February 2004)
  13. "Name Server definition at techterms.com". http://www.techterms.com/definition/nameserver. 
  14. "Providers ignoring DNS TTL ?". Slashdot. 2005. http://ask.slashdot.org/article.pl?sid=05/04/18/198259. Retrieved 2009-01-03. 
  15. "How Internet Explorer uses the cache for DNS host entries". Microsoft Corporation. 2004. http://support.microsoft.com/default.aspx?scid=KB;en-us;263558. Retrieved 2010-07-25. 
  16. RFC 5395, Domain Name System (DNS) IANA Considerations, D. Eastlake 3rd (November 2008), Section 3
  17. RFC 5395, Domain Name System (DNS) IANA Considerations, D. Eastlake 3rd (November 2008), p. 11
  18. a b RFC 4592, The Role of Wildcards in the Domain Name System, E. Lewis (July 2006)
  19. APWG. "Global Phishing Survey: Domain Name Use and Trends in 1H2010." 10/15/2010 apwg.org
  20. ICANN accredited registrars
  21. VeriSign COM and NET registry

An Internet service provider or ISP is a company that provides access to the Internet. Access ISPs directly connect customers to the Internet using copper wires, wireless or fiber-optic connections.[1] Hosting ISPs lease server space for smaller businesses and other people in a process called colocation. Transit ISPs provide large amounts of bandwidth for connecting hosting ISPs to access ISPs.[2]

Internet connectivity options from end-user to Tier 3/2 ISPs

History[edit]

The Internet started off as a closed network between government research laboratories and relevant parts of universities. As it became more popular, universities and colleges started giving more of their members access to it. As a result of its popularity, commercial Internet service providers sprang up to offer access to the Internet to anyone willing to pay for the service, mainly to those who had lost their university accounts. In 1990, Brookline, Massachusetts-based The World became the first commercial ISP.[3]

Access provider[edit]

ISPs employ a range of technologies to enable consumers to connect to their network.

For users and small businesses, traditional options include dial-up, DSL (typically asymmetric digital subscriber line, ADSL), broadband wireless, cable modem, fiber to the home (FTTH), and Integrated Services Digital Network (ISDN, typically basic rate interface). For customers with more demanding requirements, such as medium-to-large businesses or other ISPs, DSL (often single-pair high-speed digital subscriber line, SHDSL), Ethernet, Metro Ethernet, Gigabit Ethernet, Frame Relay, ISDN (BRI or PRI), Asynchronous Transfer Mode (ATM), satellite Internet access, and synchronous optical networking (SONET) are more likely to be used.

Typical home user connectivity
  • Broadband wireless access
  • Cable Internet
  • Dial-up
    • ISDN
    • Modem
  • Digital subscriber line (DSL)
  • FTTH
  • Wi-Fi
Business-type connection:
  • Digital subscriber line (DSL)
  • Metro Ethernet network technology
  • Leased line
  • SHDSL

Locality[edit]

When using a dial-up or ISDN connection, the ISP cannot determine the caller's physical location any more precisely than from the number transmitted via an appropriate form of Caller ID; it is entirely possible, for example, to connect to an ISP located in Mexico from the USA. Other means of connection, such as DOCSIS cable or DSL, require a fixed registered connection node, usually associated at the ISP with a physical address.

Mailbox provider[edit]

A mailbox provider is a company or organization that provides email mailbox hosting services for end users and/or organizations. Many mailbox providers are also access providers.

Hosting ISPs[edit]

Hosting ISPs routinely provide email, FTP, and web-hosting services. Other services include virtual machines, clouds, or entire physical servers where customers can run their own custom software.

Transit ISPs[edit]

Just as their customers pay them for Internet access, ISPs themselves pay upstream ISPs for Internet access. An upstream ISP usually has a larger network than the contracting ISP and/or is able to provide the contracting ISP with access to parts of the Internet the contracting ISP by itself has no access to.

In the simplest case, a single connection is established to an upstream ISP and is used to transmit data to or from areas of the Internet beyond the home network; this mode of interconnection is often cascaded multiple times until reaching a Tier 1 carrier. In reality, the situation is often more complex. ISPs with more than one point of presence (PoP) may have separate connections to an upstream ISP at multiple PoPs, or they may be customers of multiple upstream ISPs and may have connections to each one of them at one or more points of presence.

Peering[edit]

ISPs may engage in peering, where multiple ISPs interconnect at peering points or Internet exchange points (IXs), allowing routing of data between each network, without charging one another for the data transmitted—data that would otherwise have passed through a third upstream ISP, incurring charges from the upstream ISP.

ISPs requiring no upstream and having only customers (end customers and/or peer ISPs) are called "Tier 1 carriers".

Network hardware, software and specifications, as well as the expertise of network management personnel are important in ensuring that data follows the most efficient route, and upstream connections work reliably. A tradeoff between cost and efficiency is possible.

Derivatives[edit]

The following are not different types of ISP; rather, they are derivatives of the three core ISP types. A VISP resells access or hosting services. Free ISPs are similar, but with a different revenue model.

Virtual ISP[edit]

A Virtual ISP (VISP) is an operation that purchases services from another ISP (sometimes called a "wholesale ISP" in this context),[4] allowing the VISP's customers to access the Internet using services and infrastructure owned and operated by the wholesale ISP.

Free ISP[edit]

Free ISPs are Internet Service Providers (ISPs) which provide service free of charge. Many free ISPs display advertisements while the user is connected; like commercial television, in a sense they are selling the users' attention to the advertiser. Other free ISPs, often called freenets, are run on a nonprofit basis, usually with volunteer staff.


Cloud computing logical diagram

Cloud computing is the delivery of computing as a service rather than a product, whereby shared resources, software, and information are provided to computers and other devices as a metered service over a network (typically the Internet).

Cloud computing is a marketing term for technologies that provide computation, software, data access, and storage services that do not require end-user knowledge of the physical location and configuration of the system that delivers the services. A parallel to this concept can be drawn with the electricity grid, wherein end-users consume power without needing to understand the component devices or infrastructure required to provide the service.

Also, it is a delivery model for IT services based on Internet protocols, and it typically involves provisioning of dynamically scalable and often virtualized resources.[1][2] It is a byproduct and consequence of the ease-of-access to remote computing sites provided by the Internet.[3] This may take the form of web-based tools or applications that users can access and use through a web browser as if the programs were installed locally on their own computers.[4]

Cloud computing providers deliver applications via the internet, which are accessed from web browsers and desktop and mobile apps, while the business software and data are stored on servers at a remote location. In some cases, legacy applications (line of business applications that until now have been prevalent in thin client Windows computing) are delivered via a screen-sharing technology, while the computing resources are consolidated at a remote data centre location; in other cases, entire business applications have been coded using web-based technologies such as AJAX.

At the foundation of cloud computing is the broader concept of infrastructure convergence (or Converged Infrastructure) and shared services.[5] This type of data centre environment allows enterprises to get their applications up and running faster, with easier manageability and less maintenance, and enables IT to more rapidly adjust IT resources (such as servers, storage, and networking) to meet fluctuating and unpredictable business demand.[6][7]

Most cloud computing infrastructures consist of services delivered through shared data centres, which appear to consumers as a single point of access for their computing needs. Commercial offerings may be required to meet service-level agreements (SLAs), but specific terms are less often negotiated by smaller companies.[8][9]

The tremendous impact of cloud computing on business has prompted the United States federal government to look to the cloud as a means to reorganize its IT infrastructure and to decrease its IT budgets. With top government officials now mandating cloud adoption, many government agencies already have at least one cloud system online.[10]


Characteristics[edit]

Cloud computing exhibits the following key characteristics:

  • Empowerment of end-users of computing resources by putting the provisioning of those resources in their own control, as opposed to the control of a centralized IT service (for example)
  • Agility improves with users' ability to re-provision technological infrastructure resources.
  • Application programming interface (API) accessibility to software that enables machines to interact with cloud software in the same way the user interface facilitates interaction between humans and computers. Cloud computing systems typically use REST-based APIs.
  • Cost is claimed to be reduced and in a public cloud delivery model capital expenditure is converted to operational expenditure.[14] This is purported to lower barriers to entry, as infrastructure is typically provided by a third-party and does not need to be purchased for one-time or infrequent intensive computing tasks. Pricing on a utility computing basis is fine-grained with usage-based options and fewer IT skills are required for implementation (in-house).
  • Device and location independence[15] enable users to access systems using a web browser regardless of their location or what device they are using (e.g., PC, mobile phone). As infrastructure is off-site (typically provided by a third-party) and accessed via the Internet, users can connect from anywhere.[16]
  • Multi-tenancy enables sharing of resources and costs across a large pool of users thus allowing for:
    • Centralization of infrastructure in locations with lower costs (such as real estate, electricity, etc.)
    • Peak-load capacity increases (users need not engineer for highest possible load-levels)
    • Utilisation and efficiency improvements for systems that are often only 10–20% utilised.[17]
  • Reliability is improved if multiple redundant sites are used, which makes well-designed cloud computing suitable for business continuity and disaster recovery.[18]
  • Scalability and Elasticity via dynamic ("on-demand") provisioning of resources on a fine-grained, self-service basis near real-time, without users having to engineer for peak loads.[19][20]
  • Performance is monitored, and consistent and loosely coupled architectures are constructed using web services as the system interface.[16]
  • Security could improve due to centralization of data, increased security-focused resources, etc., but concerns can persist about loss of control over certain sensitive data, and the lack of security for stored kernels.[21] Security is often as good as or better than other traditional systems, in part because providers are able to devote resources to solving security issues that many customers cannot afford.[22] However, the complexity of security is greatly increased when data is distributed over a wider area or greater number of devices and in multi-tenant systems that are being shared by unrelated users. In addition, user access to security audit logs may be difficult or impossible. Private cloud installations are in part motivated by users' desire to retain control over the infrastructure and avoid losing control of information security.
  • Maintenance of cloud computing applications is easier, because they do not need to be installed on each user's computer.
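
To make the REST-style API point above concrete, here is a minimal Python sketch that builds (but does not send) a provisioning request; the endpoint, token, and JSON fields are entirely hypothetical:

```python
import json
import urllib.request

# Hypothetical cloud-provider endpoint and payload, for illustration only.
req = urllib.request.Request(
    "https://api.cloud.example/v1/instances",
    data=json.dumps({"image": "debian-12", "size": "small"}).encode("utf-8"),
    headers={"Authorization": "Bearer EXAMPLE_TOKEN",
             "Content-Type": "application/json"},
    method="POST",
)
# urllib.request.urlopen(req) would submit it to a real service.
```

The same resource-oriented pattern (URL identifies the resource, HTTP verb identifies the operation, JSON carries the parameters) recurs across most public cloud APIs.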

History[edit]

The term "cloud" is used as a metaphor for the Internet, based on the cloud drawing used in the past to represent the telephone network,[23] and later to depict the Internet in computer network diagrams as an abstraction of the underlying infrastructure it represents.[24]

Cloud computing is a natural evolution of the widespread adoption of virtualisation, service-oriented architecture, autonomic, and utility computing. Details are abstracted from end-users, who no longer have need for expertise in, or control over, the technology infrastructure "in the cloud" that supports them. The underlying concept of cloud computing dates back to the 1960s, when John McCarthy opined that "computation may someday be organised as a public utility." Almost all the modern-day characteristics of cloud computing (elastic provision, provided as a utility, online, illusion of infinite supply), the comparison to the electricity industry and the use of public, private, government, and community forms, were thoroughly explored in Douglas Parkhill's 1966 book, The Challenge of the Computer Utility. Other scholars have shown that cloud computing's roots go all the way back to the 1950s when scientist Herb Grosch (the author of Grosch's law) postulated that the entire world would operate on dumb terminals powered by about 15 large data centers.[25]

The actual term "cloud" borrows from telephony in that telecommunications companies, who until the 1990s offered primarily dedicated point-to-point data circuits, began offering Virtual Private Network (VPN) services with comparable quality of service but at a much lower cost. By switching traffic to balance utilisation as they saw fit, they were able to utilise their overall network bandwidth more effectively. The cloud symbol was used to denote the demarcation point between that which was the responsibility of the provider and that which was the responsibility of the user. Cloud computing extends this boundary to cover servers as well as the network infrastructure.[26]

After the dot-com bubble, Amazon played a key role in the development of cloud computing by modernising their data centers, which, like most computer networks, were using as little as 10% of their capacity at any one time, just to leave room for occasional spikes. Having found that the new cloud architecture resulted in significant internal efficiency improvements whereby small, fast-moving "two-pizza teams" could add new features faster and more easily, Amazon initiated a new product development effort to provide cloud computing to external customers, and launched Amazon Web Service (AWS) on a utility computing basis in 2006.[17][27]

In early 2008, Eucalyptus became the first open-source, AWS API-compatible platform for deploying private clouds. In early 2008, OpenNebula, enhanced in the RESERVOIR European Commission-funded project, became the first open-source software for deploying private and hybrid clouds, and for the federation of clouds.[28] In the same year, efforts were focused on providing QoS guarantees (as required by real-time interactive applications) to cloud-based infrastructures, in the framework of the IRMOS European Commission-funded project, resulting in a real-time cloud environment.[29] By mid-2008, Gartner saw an opportunity for cloud computing "to shape the relationship among consumers of IT services, those who use IT services and those who sell them"[30] and observed that "[o]rganisations are switching from company-owned hardware and software assets to per-use service-based models" so that the "projected shift to cloud computing ... will result in dramatic growth in IT products in some areas and significant reductions in other areas."[31]

Layers[edit]

Once an internet protocol connection is established among several computers, it is possible to share services within any one of the following layers.

Client[edit]

A cloud client consists of computer hardware and/or computer software that relies on cloud computing for application delivery and that is in essence useless without it. Examples include some computers (example: Chromebooks), phones (example: Google Nexus series) and other devices, operating systems (example: Google Chrome OS), and browsers.[32][33][34]

Application[edit]

Cloud application services or "Software as a Service (SaaS)" deliver software as a service over the Internet, eliminating the need to install and run the application on the customer's own computers and simplifying maintenance and support.

A cloud application is software provided as a service. It consists of a package of interrelated tasks, the definition of these tasks, and configuration files containing dynamic information about the tasks at run-time. Cloud tasks provide compute, storage, communication and management capabilities. Tasks can be cloned into multiple virtual machines and are accessible through application programming interfaces (APIs). Cloud applications are a kind of utility computing that can scale out and in to match workload demand. Their pricing model is based on compute usage, storage usage, and tenancy metrics.[35]

What makes a cloud application different from other applications is its elasticity: it can scale out and in by cloning tasks into multiple virtual machines at run-time to meet changing demand. Configuration data determines the dynamic aspects of a cloud application at run-time; there is no need to stop the running application or redeploy it in order to modify or change the information in this file.[36]

SOA is an umbrella term that describes any kind of service. A cloud application is a service, and a cloud application meta-model is a SOA model that conforms to the SOA meta-model; this makes cloud applications SOA applications. However, SOA applications are not necessarily cloud applications. A cloud application is a SOA application that runs in a specific environment, the cloud computing platform, which is characterized by horizontal scalability, rapid provisioning, ease of access, and flexible pricing. While SOA is a business model that addresses business process management, cloud architecture addresses many environment-specific technical details, which makes it more of a technical model.[35]

Platform[edit]

Cloud platform services, also known as platform as a service (PaaS), deliver a computing platform and/or solution stack as a service, often consuming cloud infrastructure and sustaining cloud applications.[37] This facilitates deployment of applications without the cost and complexity of buying and managing the underlying hardware and software layers.[38][39] Cloud platforms let developers write applications that run in the cloud or that use services provided by the cloud. Various names are used for such platforms, including on-demand platform and Cloud 9. Regardless of the nomenclature, without a ready-made platform each development team creating applications for the cloud would have to build its own.

Infrastructure[edit]

Cloud infrastructure services, also known as "infrastructure as a service" (IaaS), deliver computer infrastructure – typically a platform virtualization environment – as a service, along with raw (block) storage and networking. Rather than purchasing servers, software, data-center space or network equipment, clients instead buy those resources as a fully outsourced service. Suppliers typically bill such services on a utility computing basis; the amount of resources consumed (and therefore the cost) will typically reflect the level of activity.[40]

Server[edit]

The servers layer consists of computer hardware and/or computer software products that are specifically designed for the delivery of cloud services, including multi-core processors, cloud-specific operating systems and combined offerings.[41][42][43][44]

Deployment models[edit]

Cloud computing types

Public cloud[edit]

A public cloud is one based on the standard cloud computing model, in which a service provider makes resources, such as applications and storage, available to the general public over the Internet. Public cloud services may be free or offered on a pay-per-usage model.[16] Cloud computing is no longer obscure; companies make extensive use of it to run their businesses virtually, saving considerably on infrastructure that would otherwise have cost them dearly.

In this model, service providers make their computing resources available online for the general public, allowing users to access resources in the cloud such as software, applications, and stored data. One of the prime benefits of the public cloud is that users are freed from tasks they would otherwise have to perform on their own machines, including installation, configuration, and storage of those resources.

The services of public cloud computing can be leveraged either free of cost or on a pay-per-usage model.

Community cloud[edit]

Community cloud shares infrastructure between several organizations from a specific community with common concerns (security, compliance, jurisdiction, etc.), whether managed internally or by a third-party and hosted internally or externally. The costs are spread over fewer users than a public cloud (but more than a private cloud), so only some of the benefits of cloud computing are realized.[45]

Hybrid cloud[edit]

Hybrid cloud is a composition of two or more clouds (private, community, or public) that remain unique entities but are bound together, offering the benefits of multiple deployment models. It can also be defined as multiple cloud systems that are connected in a way that allows programs and data to be moved easily from one deployment system to another.[45] In its simplest form, a hybrid cloud combines public and private clouds, for example when a public cloud vendor integrates with a private cloud platform, or vice versa.

Private cloud[edit]

Private cloud is infrastructure operated solely for a single organization, whether managed internally or by a third-party and hosted internally or externally.[45]

They have attracted criticism because users "still have to buy, build, and manage them" and thus do not benefit from less hands-on management,[46] essentially "[lacking] the economic model that makes cloud computing such an intriguing concept".[47][48]

Architecture[edit]

Cloud computing sample architecture

Cloud architecture,[49] the systems architecture of the software systems involved in the delivery of cloud computing, typically involves multiple cloud components communicating with each other over a loose coupling mechanism such as a messaging queue.
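
The loose coupling described above can be sketched in a few lines. This is a minimal, illustrative Python sketch (not any particular vendor's architecture): a front-end component and a worker exchange messages through a queue and never call each other directly, so either side can be replaced or scaled out independently.

```python
import queue
import threading

# The queue stands in for a cloud messaging service; neither component
# holds a direct reference to the other.
messages = queue.Queue()

def front_end():
    # The front end enqueues requests without knowing who processes them.
    for job in ["resize-image-1", "resize-image-2"]:
        messages.put(job)
    messages.put(None)  # sentinel: no more work

def worker(results):
    # The worker consumes requests at its own pace (loose coupling).
    while True:
        job = messages.get()
        if job is None:
            break
        results.append(f"done:{job}")

results = []
t = threading.Thread(target=worker, args=(results,))
t.start()
front_end()
t.join()
print(results)  # ['done:resize-image-1', 'done:resize-image-2']
```

Because the only shared contract is the message format, the in-process queue could be swapped for a hosted messaging service without changing either component's logic.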

The Intercloud[edit]

The Intercloud[50] is an interconnected global "cloud of clouds"[51][52] and an extension of the Internet "network of networks" on which it is based.[53][54][55]

Cloud engineering[edit]

Cloud engineering is the application of engineering disciplines to cloud computing. It brings a systematic approach to the high-level concerns of commercialisation, standardisation, and governance in conceiving, developing, operating and maintaining cloud computing systems. It is a multidisciplinary method encompassing contributions from diverse areas such as systems, software, web, performance, information, security, platform, risk, and quality engineering.

Issues[edit]

Privacy[edit]

The cloud model has been criticised by privacy advocates because the companies hosting cloud services can more easily monitor, lawfully or unlawfully, the communication and data stored between the user and the host company. Instances such as the secret NSA program that, working with AT&T and Verizon, recorded over 10 million phone calls between American citizens, cause uncertainty among privacy advocates about the greater powers such arrangements give telecommunication companies to monitor user activity.[56] While there have been efforts (such as US-EU Safe Harbor) to "harmonise" the legal environment, providers such as Amazon still cater to major markets (typically the United States and the European Union) by deploying local infrastructure and allowing customers to select "availability zones."[57] Cloud computing also poses privacy concerns because the service provider may access the data on the cloud at any time, and could accidentally or deliberately alter or even delete information.[58]

Compliance[edit]

In order to obtain compliance with regulations including FISMA, HIPAA, and SOX in the United States, the Data Protection Directive in the EU and the credit card industry's PCI DSS, users may have to adopt community or hybrid deployment modes that are typically more expensive and may offer restricted benefits. This is how Google is able to "manage and meet additional government policy requirements beyond FISMA"[59][60] and Rackspace Cloud or QubeSpace are able to claim PCI compliance.[61]

Many providers also obtain SAS 70 Type II certification, but this has been criticised on the grounds that the hand-picked set of goals and standards determined by the auditor and the auditee are often not disclosed and can vary widely.[62] Providers typically make this information available on request, under non-disclosure agreement.[63][64]

Customers in the EU contracting with cloud providers established outside the EU/EEA have to adhere to the EU regulations on export of personal data.[65]

Legal[edit]

As can be expected with any revolutionary change in the landscape of global computing, certain legal issues arise, ranging from trademark infringement and security concerns to the sharing of proprietary data resources.

Open source[edit]

Open-source software has provided the foundation for many cloud computing implementations, one prominent example being the Hadoop framework.[66] In November 2007, the Free Software Foundation released the Affero General Public License, a version of GPLv3 intended to close a perceived legal loophole associated with free software designed to be run over a network.[67]

Open standards[edit]

Most cloud providers expose APIs that are typically well-documented (often under a Creative Commons license[68]) but also unique to their implementation and thus not interoperable. Some vendors have adopted others' APIs and there are a number of open standards under development, with a view to delivering interoperability and portability.[69]
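
The portability such open standards aim for is often approximated today by an abstraction layer in application code. The sketch below is purely illustrative (both provider classes and their method names are hypothetical, not real vendor APIs): the application targets one abstract interface, and thin per-provider adapters absorb each provider's incompatible API.

```python
from abc import ABC, abstractmethod

class ObjectStore(ABC):
    """The provider-neutral interface the application codes against."""
    @abstractmethod
    def put(self, key: str, data: bytes) -> None: ...
    @abstractmethod
    def get(self, key: str) -> bytes: ...

class ProviderA(ObjectStore):
    # A dict stands in for hypothetical provider A's upload/download calls.
    def __init__(self):
        self._blobs = {}
    def put(self, key, data):
        self._blobs[key] = data
    def get(self, key):
        return self._blobs[key]

class ProviderB(ObjectStore):
    # A second, differently shaped backend behind the same interface.
    def __init__(self):
        self._objects = {}
    def put(self, key, data):
        self._objects[key] = data
    def get(self, key):
        return self._objects[key]

def backup(store: ObjectStore) -> bytes:
    # Application code is portable: it never names a specific provider.
    store.put("config", b"version=1")
    return store.get("config")

assert backup(ProviderA()) == backup(ProviderB()) == b"version=1"
```

An agreed open standard would in effect move this adapter layer out of every application and into the providers themselves.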

Security[edit]

As cloud computing is achieving increased popularity, concerns are being voiced about the security issues introduced through adoption of this new model. The effectiveness and efficiency of traditional protection mechanisms are being reconsidered as the characteristics of this innovative deployment model differ widely from those of traditional architectures.[70]

The relative security of cloud computing services is a contentious issue that may be delaying its adoption.[71] Issues barring the adoption of cloud computing are due in large part to the private and public sectors' unease surrounding the external management of security-based services. It is the very nature of cloud computing-based services, private or public, that promote external management of provided services. This delivers great incentive to cloud computing service providers to prioritize building and maintaining strong management of secure services.[72] Security issues have been categorised into sensitive data access, data segregation, privacy, bug exploitation, recovery, accountability, malicious insiders, management console security, account control, and multi-tenancy issues. Solutions to various cloud security issues vary, from cryptography, particularly public key infrastructure (PKI), to use of multiple cloud providers, standardisation of APIs, and improving virtual machine support and legal support.[70][73][74]
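
The public-key idea underlying PKI, mentioned above as a cloud security building block, can be illustrated with textbook RSA. This is a toy sketch only, using the tiny primes of the standard textbook example; it is completely insecure at this size, and real systems use vetted cryptographic libraries with 2048-bit or larger keys.

```python
# Toy textbook RSA with tiny primes -- for illustration only, never for
# real security.
p, q = 61, 53
n = p * q                    # modulus, part of both keys
phi = (p - 1) * (q - 1)      # Euler's totient of n
e = 17                       # public exponent (coprime with phi)
d = pow(e, -1, phi)          # private exponent: e*d = 1 (mod phi)

message = 65                         # any number smaller than n
ciphertext = pow(message, e, n)      # anyone can encrypt with (e, n)
decrypted = pow(ciphertext, d, n)    # only the private-key holder can decrypt
assert decrypted == message
```

The point for cloud computing is that the encrypting party needs only the public pair (e, n), so data can be protected before it ever reaches a provider that holds no private key.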

Sustainability[edit]

Although cloud computing is often assumed to be a form of "green computing", there is as yet no published study to substantiate this assumption.[75] Where the servers are sited affects the environmental impact of cloud computing. In areas where the climate favors natural cooling and renewable electricity is readily available, the environmental effects will be more moderate. (The same holds true for "traditional" data centers.) Thus countries with favorable conditions, such as Finland,[76] Sweden and Switzerland,[77] are trying to attract cloud computing data centers. Energy efficiency in cloud computing can result from energy-aware scheduling and server consolidation.[78] However, in the case of clouds distributed over data centers with different energy sources, including renewable sources, a small compromise on energy-consumption reduction can yield a large reduction in carbon footprint.[79]
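
Server consolidation, one of the energy-efficiency techniques mentioned above, can be sketched as a bin-packing problem. The sketch below uses a first-fit-decreasing heuristic with made-up CPU demands; production schedulers are considerably more sophisticated.

```python
# A minimal sketch of server consolidation: pack virtual machines (by
# fractional CPU demand) onto as few servers as possible, so the unused
# servers can be powered down to save energy.
def consolidate(vm_loads, server_capacity):
    servers = []     # remaining capacity of each powered-on server
    placement = []   # which server each (sorted) VM landed on
    for load in sorted(vm_loads, reverse=True):  # first-fit decreasing
        for i, free in enumerate(servers):
            if load <= free:
                servers[i] -= load
                placement.append(i)
                break
        else:
            servers.append(server_capacity - load)  # power on a new server
            placement.append(len(servers) - 1)
    return len(servers), placement

# Five VMs fit on two servers instead of five.
n_servers, _ = consolidate([0.5, 0.7, 0.3, 0.2, 0.3], server_capacity=1.0)
print(n_servers)  # 2
```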

Abuse[edit]

As with privately purchased hardware, crackers posing as legitimate customers can purchase the services of cloud computing for nefarious purposes. This includes password cracking and launching attacks using the purchased services.[80] In 2009, a banking trojan illegally used the popular Amazon service as a command and control channel that issued software updates and malicious instructions to PCs that were infected by the malware.[81]

Research[edit]

Many universities, vendors and government organisations are investing in research around the topic of cloud computing:[82][83]

  • In October 2007 the Academic Cloud Computing Initiative (ACCI) was announced as a multi-university project designed to enhance students' technical knowledge to address the challenges of cloud computing.[84]
  • In April 2009 the St Andrews Cloud Computing Co-laboratory (StACC) was launched, focusing on research in the important new area of cloud computing. Unique in the UK, StACC aims to become an international centre of excellence for research and teaching in cloud computing and will provide advice and information to businesses interested in using cloud-based services.
  • In December 2010, the TrustCloud research project[85][86] was started by HP Labs Singapore to address transparency and accountability of cloud computing via detective, data-centric approaches[87] encapsulated in a five-layer TrustCloud Framework. The team identified the need for monitoring data life cycles and transfers in the cloud,[88] leading to the tackling of key cloud computing security issues such as cloud data leakages, cloud accountability and cross-national data transfers in transnational clouds.
  • In July 2011 the High Performance Computing Cloud (HPCCLoud) project was launched to explore the possibilities of enhancing performance in cloud environments while running scientific applications, through development of the HPCCLoud Performance Analysis Toolkit. The project was funded by the CIM-Returning Experts Programme under the coordination of Prof. Dr. Shajulin Benedict.
  • In June 2011 the Telecommunications Industry Association developed a Cloud Computing White Paper, to analyze the integration challenges and opportunities between cloud services and traditional U.S. telecommunications standards.[89]

References[edit]

  1. "Gartner Says Cloud Computing Will Be As Influential As E-business". Gartner.com. http://www.gartner.com/it/page.jsp?id=707508. Retrieved 2010-08-22. 
  2. Gruman, Galen (2008-04-07). "What cloud computing really means". InfoWorld. http://www.infoworld.com/d/cloud-computing/what-cloud-computing-really-means-031. Retrieved 2009-06-02. 
  3. "Cloud Computing: Clash of the clouds". The Economist. 2009-10-15. http://www.economist.com/displaystory.cfm?story_id=14637206. Retrieved 2009-11-03. 
  4. Cloud Computing Defined 17 July 2010. Retrieved 26 July 2010.
  5. "Kerravala, Zeus, Yankee Group, "Migrating to the cloud is dependent on a converged infrastructure," Tech Target". Convergedinfrastructure.com. http://www.convergedinfrastructure.com/Path-to-the-Cloud/. Retrieved 2011-12-02. 
  6. "Baburajan, Rajani, "The Rising Cloud Storage Market Opportunity Strengthens Vendors," infoTECH, August 24, 2011". It.tmcnet.com. 2011-08-24. http://it.tmcnet.com/channels/cloud-storage/articles/211183-rising-cloud-storage-market-opportunity-strengthens-vendors.htm. Retrieved 2011-12-02. 
  7. "Oestreich, Ken, "Converged Infrastructure," CTO Forum, November 15, 2010". Thectoforum.com. 2010-11-15. http://www.thectoforum.com/content/converged-infrastructure-0. Retrieved 2011-12-02. 
  8. Buyya, Rajkumar; Chee Shin Yeo, Srikumar Venugopal (PDF). Market-Oriented Cloud Computing: Vision, Hype, and Reality for Delivering IT Services as Computing Utilities. Department of Computer Science and Software Engineering, University of Melbourne, Australia. p. 9. http://www.gridBus.org/~raj/papers/hpcc2008_keynote_cloudcomputing.pdf. Retrieved 2008-07-31. 
  9. Lillington, Karlin. "Getting clear about cloud computing". The Irish Times. http://www.irishtimes.com/business-services/cloud-computing-ireland/getting-clear-about-cloud-computing/. 
  10. Thomas J. Kwasniewski, EJ Puig, "Cloud Computing in the Government", Data & Analysis Centre for Software, July 2011
  11. "What's In A Name? Utility vs. Cloud vs Grid". Datacenterknowledge.com. http://www.datacenterknowledge.com/archives/2008/Mar/25/whats_in_a_name_utility_vs_cloud_vs_grid.html. Retrieved 2010-08-22. 
  12. "Distributed Application Architecture". Sun Microsystem. Archived from the original on 2003-12-16. http://web.archive.org/web/20031216124424/http://java.sun.com/developer/Books/jdbc/ch07.pdf. Retrieved 2009-06-16. 
  13. "It's probable that you've misunderstood 'Cloud Computing' until now". TechPluto. http://portal.acm.org/citation.cfm?id=1496091.1496100&coll=&dl=ACM&CFID=21518680&CFTOKEN=18800807. Retrieved 2010-09-14. 
  14. "Recession Is Good For Cloud Computing – Microsoft Agrees". CloudAve. http://www.cloudave.com/link/recession-is-good-for-cloud-computing-microsoft-agrees. Retrieved 2010-08-22. 
  15. Farber, Dan (2008-06-25). "The new geek chic: Data centers". CNET News. http://news.cnet.com/8301-13953_3-9977049-80.html. Retrieved 2010-08-22. 
  16. a b c Invalid <ref> tag; no text was provided for refs named idc
  17. a b Jeff Bezos' Risky Bet.
  18. King, Rachael (2008-08-04). "Cloud Computing: Small Companies Take Flight". Businessweek. http://www.businessweek.com/technology/content/aug2008/tc2008083_619516.htm. Retrieved 2010-08-22. 
  19. "Defining and Measuring Cloud Elasticity". KIT Software Quality Departement. http://digbib.ubka.uni-karlsruhe.de/volltexte/1000023476. Retrieved 13 August 2011. 
  20. "Economies of Cloud Scale Infrastructure". Cloud Slam 2011. http://www.youtube.com/watch?v=nfDsY3f4nVI. Retrieved 13 May 2011. 
  21. "Encrypted Storage and Key Management for the cloud". Cryptoclarity.com. 2009-07-30. http://www.cryptoclarity.com/CryptoClarityLLC/Welcome/Entries/2009/7/23_Encrypted_Storage_and_Key_Management_for_the_cloud.html. Retrieved 2010-08-22. 
  22. Mills, Elinor (2009-01-27). "Cloud computing security forecast: Clear skies". CNET News. http://news.cnet.com/8301-1009_3-10150569-83.html. Retrieved 2010-08-22. 
  23. "Writing & Speaking". Sellsbrothers.com. http://www.sellsbrothers.com/writing/intro2tapi/default.aspx?content=pstn.htm. Retrieved 2010-08-22. 
  24. "The Internet Cloud". Thestandard.com. http://www.thestandard.com/article/0,1902,5466,00.html. Retrieved 2010-08-22. 
  25. Regulation of the Cloud in India, Ryan, Falvey & Merchant, Journal of Internet Law, Vol 15, No. 4 (October 2011).
  26. "July, 1993 meeting report from the IP over ATM working group of the IETF". http://mirror.switch.ch/ftp/doc/ietf/ipatm/atm-minutes-93jul.txt. Retrieved 2010-08-22. 
  27. [2].
  28. B. Rochwerger, J. Caceres, R.S. Montero, D. Breitgand, E. Elmroth, A. Galis, E. Levy, I.M. Llorente, K. Nagin, Y. Wolfsthal, E. Elmroth, J. Caceres, M. Ben-Yehuda, W. Emmerich, F. Galan. "The RESERVOIR Model and Architecture for Open Federated Cloud Computing", IBM Journal of Research and Development, Vol. 53, No. 4. (2009)
  29. D. Kyriazis, A. Menychtas, G. Kousiouris, K. Oberle, T. Voith, M. Boniface, E. Oliveros, T. Cucinotta, S. Berger, “A Real-time Service Oriented Infrastructure”, International Conference on Real-Time and Embedded Systems (RTES 2010), Singapore, November 2010
  30. Keep an eye on cloud computing, Amy Schurr, Network World, 2008-07-08, citing the Gartner report, "Cloud Computing Confusion Leads to Opportunity". Retrieved 2009-09-11.
  31. Gartner Says Worldwide IT Spending On Pace to Surpass Trillion in 2008, Gartner, 2008-08-18. Retrieved 2009-09-11.
  32. Claburn, Thomas. "Google Reveals Nexus One 'Super Phone'". InformationWeek. http://www.informationweek.com/news/software/web_services/showArticle.jhtml?articleID=222200331. Retrieved 2010-08-22. 
  33. "What Makes a Cloud Computer?". Gigaom.com. 2008-06-22. http://gigaom.com/2008/06/22/what-makes-a-good-cloud-computer/. Retrieved 2010-08-22. 
  34. Braiker, Brian (2008-09-02). "The Cloud's Chrome Lining". Newsweek.com. http://www.newsweek.com/id/156911. Retrieved 2010-08-22. 
  35. a b Mohammad Hamdaqa, Tassos Livogiannis, Ladan Tahvildari: A Reference Model for Developing Cloud Applications. CLOSER 2011: 98-103
  36. http://www.stargroup.uwaterloo.ca/~mhamdaqa/publications/A%20REFERENCEMODELFORDEVELOPINGCLOUD%20APPLICATIONS.pdf
  37. "An example of a 'Cloud Platform' for building applications". Eccentex.com. http://www.eccentex.com/platform/workflow.html. Retrieved 2010-08-22. 
  38. Jack Schofield (2008-04-17). "Google angles for business users with 'platform as a service'". London: Guardian. http://www.guardian.co.uk/technology/2008/apr/17/google.software. Retrieved 2010-08-22. 
  39. "The Emerging Cloud Service Architecture". Aws.typepad.com. 2008-06-03. http://aws.typepad.com/aws/2008/06/the-forthcoming.html. Retrieved 2010-08-22. 
  40. "EMC buys Pi and forms a cloud computing group". Searchstorage.techtarget.com. 2008-02-21. http://searchstorage.techtarget.com/news/article/0,289142,sid5_gci1301852,00.html. Retrieved 2010-08-22. 
  41. Nimbus Cloud Guide[dead link]
  42. Myslewski, Rik (2009-12-02). "Intel puts cloud on single megachip". Theregister.co.uk. http://www.theregister.co.uk/2009/12/02/intel_scc/. Retrieved 2010-08-22. 
  43. Duffy, Jim (2009-05-12). "Cisco unveils cloud computing platform for service providers". Infoworld.com. http://www.infoworld.com/d/cloud-computing/cisco-unveils-cloud-computing-platform-service-providers-113. Retrieved 2010-08-22. 
  44. Markoff, John (2008-10-27). "Microsoft Plans 'Cloud' Operating System". Nytimes.com. http://www.nytimes.com/2008/10/28/technology/28soft.html. Retrieved 2011-08-20. 
  45. a b c "The NIST Definition of Cloud Computing". National Institute of Science and Technology. http://csrc.nist.gov/publications/nistpubs/800-145/SP800-145.pdf. Retrieved 24 July 2011. 
  46. Foley, John. "Private Clouds Take Shape". InformationWeek. http://www.informationweek.com/news/services/business/showArticle.jhtml?articleID=209904474. Retrieved 2010-08-22. 
  47. Haff, Gordon (2009-01-27). "Just don't call them private clouds". CNET News. http://news.cnet.com/8301-13556_3-10150841-61.html. Retrieved 2010-08-22. 
  48. "There's No Such Thing As A Private Cloud". InformationWeek. 2010-06-30. http://www.informationweek.com/cloud-computing/blog/archives/2009/01/theres_no_such.html. Retrieved 2010-08-22. 
  49. "Building GrepTheWeb in the Cloud, Part 1: Cloud Architectures". Developer.amazonwebservices.com. http://developer.amazonwebservices.com/connect/entry.jspa?externalID=1632&categoryID=100. Retrieved 2010-08-22. 
  50. Bernstein, David; Ludvigson, Erik; Sankar, Krishna; Diamond, Steve; Morrow, Monique (2009-05-24). Blueprint for the Intercloud – Protocols and Formats for Cloud Computing Interoperability. IEEE Computer Society. pp. 328–336. doi:10.1109/ICIW.2009.55. http://www2.computer.org/portal/web/csdl/doi/10.1109/ICIW.2009.55. 
  51. "Kevin Kelly: A Cloudbook for the Cloud". Kk.org. http://www.kk.org/thetechnium/archives/2007/11/a_cloudbook_for.php. Retrieved 2010-08-22. 
  52. "Intercloud is a global cloud of clouds". Samj.net. 2009-06-22. http://samj.net/2009/06/intercloud-is-global-cloud-of-clouds.html. Retrieved 2010-08-22. 
  53. "Vint Cerf: Despite Its Age, The Internet is Still Filled with Problems". Readwriteweb.com. http://www.readwriteweb.com/archives/vint_cerf_despite_its_age_the.php?mtcCampaign=2765. Retrieved 2010-08-22. 
  54. "SP360: Service Provider: From India to Intercloud". Blogs.cisco.com. http://blogs.cisco.com/sp/comments/from_india_to_intercloud/. Retrieved 2010-08-22. 
  55. Canada (2007-11-29). "Head in the clouds? Welcome to the future". Toronto: Theglobeandmail.com. Archived from the original on 2007-12-15. http://web.archive.org/web/20071215230430/http://www.theglobeandmail.com/servlet/story/LAC.20071129.TWLINKS29/TPStory/Business. Retrieved 2010-08-22. 
  56. Cauley, Leslie (2006-05-11). "NSA has massive database of Americans' phone calls". USATODAY.com. http://www.usatoday.com/news/washington/2006-05-10-nsa_x.htm. Retrieved 2010-08-22. 
  57. "Feature Guide: Amazon EC2 Availability Zones". Amazon Web Services. http://developer.amazonwebservices.com/connect/entry.jspa?externalID=1347&categoryID=112. Retrieved 2010-08-22. 
  58. "Cloud Computing Privacy Concerns on Our Doorstep". http://cacm.acm.org/magazines/2011/1/103200-cloud-computing-privacy-concerns-on-our-doorstep/fulltext. 
  59. "FISMA compliance for federal cloud computing on the horizon in 2010". SearchCompliance.com. http://searchcompliance.techtarget.com/news/article/0,289142,sid195_gci1377298,00.html. Retrieved 2010-08-22. 
  60. "Google Apps and Government". Official Google Enterprise Blog. 2009-09-15. http://googleenterprise.blogspot.com/2009/09/google-apps-and-government.html. Retrieved 2010-08-22. 
  61. "Cloud Hosting is Secure for Take-off: Mosso Enables The Spreadsheet Store, an Online Merchant, to become PCI Compliant". Rackspace. 2009-03-14. http://www.rackspace.com/cloud/blog/2009/03/05/cloud-hosting-is-secure-for-take-off-mosso-enables-the-spreadsheet-store-an-online-merchant-to-become-pci-compliant/. Retrieved 2010-08-22. 
  62. "Amazon gets SAS 70 Type II audit stamp, but analysts not satisfied". SearchCloudComputing.com. 2009-11-17. http://searchcloudcomputing.techtarget.com/news/article/0,289142,sid201_gci1374629,00.html. Retrieved 2010-08-22. 
  63. "Assessing Cloud Computing Agreements and Controls". WTN News. http://wistechnology.com/articles/6954/. Retrieved 2010-08-22. 
  64. "Cloud Certification From Compliance Mandate to Competitive Differentiator". Cloudcor. http://www.youtube.com/watch?v=wYiFdnZAlNQ. Retrieved 2011-09-20. 
  65. "How the New EU Rules on Data Export Affect Companies in and Outside the EU | Dr. Thomas Helbing – Kanzlei für Datenschutz-, Online- und IT-Recht". Dr. Thomas Helbing. http://www.thomashelbing.com/en/how-new-eu-rules-data-export-affect-companies-and-outside-eu. Retrieved 2010-08-22. 
  66. "Open source fuels growth of cloud computing, software-as-a-service". Network World. http://www.networkworld.com/news/2008/072808-open-source-cloud-computing.html. Retrieved 2010-08-22. 
  67. "AGPL: Open Source Licensing in a Networked Age". Redmonk.com. 2009-04-15. http://redmonk.com/sogrady/2009/04/15/open-source-licensing-in-a-networked-age/. Retrieved 2010-08-22. 
  68. GoGrid Moves API Specification to Creative Commons[dead link]
  69. "Eucalyptus Completes Amazon Web Services Specs with Latest Release". Ostatic.com. http://ostatic.com/blog/eucalyptus-completes-amazon-web-services-specs-with-latest-release. Retrieved 2010-08-22. 
  70. a b Zissis, Dimitrios; Lekkas (2010). "Addressing cloud computing security issues". Future Generation Computer Systems. doi:10.1016/j.future.2010.12.006. http://www.sciencedirect.com/science/article/pii/S0167739X10002554. 
  71. "Are security issues delaying adoption of cloud computing?". Network World. http://www.networkworld.com/news/2009/042709-burning-security-cloud-computing.html. Retrieved 2010-08-22. 
  72. "Security of virtualization, cloud computing divides IT and security pros". Network World. 2010-02-22. http://www.networkworld.com/news/2010/022210-virtualization-cloud-security-debate.html. Retrieved 2010-08-22. 
  73. Armbrust, M; Fox, A., Griffith, R., Joseph, A., Katz, R., Konwinski, A., Lee, G., Patterson, D., Rabkin, A., Zaharia, (2010). "A view of cloud computing.". Communication of the ACM 53 (4): 50–58. doi:10.1145/1721654.1721672. 
  74. Anthens, G. "Security in the cloud". Communications of the ACM 53 (11). doi:10.1145/1839676.1839683. 
  75. James Urquhart (January 7, 2010). "Cloud computing's green paradox". CNET News. http://news.cnet.com/8301-19413_3-10428065-240.html. Retrieved March 12, 2010. "...there is some significant evidence that the cloud is encouraging more compute consumption" 
  76. Finland – First Choice for Siting Your Cloud Computing Data Center. Retrieved 4 August 2010.
  77. Swiss Carbon-Neutral Servers Hit the Cloud. Retrieved 4 August 2010.
  78. Berl, Andreas, et al., Energy-Efficient Cloud Computing, The Computer Journal, 2010.
  79. Farrahi Moghaddam, Fereydoun, et al., Low Carbon Virtual Private Clouds, IEEE Cloud 2011.
  80. Alpeyev, Pavel (2011-05-14). "Amazon.com Server Said to Have Been Used in Sony Attack". Bloomberg. http://www.bloomberg.com/news/2011-05-13/sony-network-said-to-have-been-invaded-by-hackers-using-amazon-com-server.html. Retrieved 2011-08-20. 
  81. http://www.theregister.co.uk/2011/05/14/playstation_network_attack_from_amazon/
  82. "Cloud Net Directory. Retrieved 2010-03-01". Cloudbook.net. http://www.cloudbook.net/directories/research-clouds. Retrieved 2010-08-22. 
  83. "– National Science Foundation (NSF) News – National Science Foundation Awards Millions to Fourteen Universities for Cloud Computing Research – US National Science Foun". Nsf.gov. http://www.nsf.gov/news/news_summ.jsp?cntn_id=114686. Retrieved 2011-08-20. 
  84. Rich Miller (2008-05-02). "IBM, Google Team on an Enterprise Cloud". DataCenterKnowledge.com. http://www.datacenterknowledge.com/archives/2008/05/02/ibm-google-team-on-an-enterprise-cloud/. Retrieved 2010-08-22. 
  85. Ko, Ryan K. L.; Jagadpramana, Peter; Lee, Bu Sung (2011). "Flogger: A File-centric Logger for Monitoring File Access and Transfers within Cloud Computing Environments". Proceedings of the 10th IEEE International Conference on Trust, Security and Privacy of Computing and Communications (TrustCom-11). http://www.hpl.hp.com/techreports/2011/HPL-2011-119.pdf. 
  86. Ko, Ryan K. L.; Jagadpramana, Peter; Mowbray, Miranda; Pearson, Siani; Kirchberg, Markus; Liang, Qianhui; Lee, Bu Sung (2011). "TrustCloud: A Framework for Accountability and Trust in Cloud Computing". Proceedings of the 2nd IEEE Cloud Forum for Practitioners (IEEE ICFP 2011), Washington DC, USA, July 7-8, 2011. http://www.hpl.hp.com/techreports/2011/HPL-2011-38.pdf. 
  87. Ko, Ryan K. L. Ko; Kirchberg, Markus; Lee, Bu Sung (2011). "From System-Centric Logging to Data-Centric Logging - Accountability, Trust and Security in Cloud Computing". Proceedings of the 1st Defence, Science and Research Conference 2011 - Symposium on Cyber Terrorism, IEEE Computer Society, 3-4 August 2011, Singapore. http://www.hpl.hp.com/people/ryan_ko/RKo-DSR2011-Data_Centric_Logging.pdf. 
  88. Ko, Ryan K. L.; Jagadpramana, Peter; Lee, Bu Sung (2011). "Flogger: A File-centric Logger for Monitoring File Access and Transfers within Cloud Computing Environments". Proceedings of the 10th IEEE International Conference on Trust, Security and Privacy of Computing and Communications (TrustCom-11). http://www.hpl.hp.com/techreports/2011/HPL-2011-119.pdf. 
  89. "Publication Download". Tiaonline.org. http://www.tiaonline.org/market_intelligence/publication_download.cfm?file=TIA_Cloud_Computing_White_Paper. Retrieved 2011-12-02. 

Hard drives store information in binary form and so are considered a type of physical digital media.

Digital media is a form of electronic media where data is stored in digital (as opposed to analog) form. It can refer to the technical aspect of storage and transmission (e.g. hard disk drives or computer networking) of information or to the "end product", such as digital video, augmented reality or digital art.

Florida's digital media industry association, Digital Media Alliance Florida, defines digital media as "the creative convergence of digital arts, science, technology and business for human expression, communication, social interaction and education".

There is a rich history of non-binary digital media and computers.

Examples[edit]

The following list of digital media is based on a rather technical view of the term media. Other views might lead to different lists.

Data conversion[edit]

Main page: Analog-to-digital converter

The transformation of an analog signal to digital information via an analog-to-digital converter is called sampling. Most digital media are based on translating analog data into digital data and vice-versa (see digital recording, digital video, television versus digital television).
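
The two steps of analog-to-digital conversion, sampling in time and quantisation in amplitude, can be sketched as follows. This is a minimal illustration: a 1 Hz sine wave stands in for an analog signal, sampled 8 times per second and quantised to 3-bit resolution.

```python
import math

sample_rate = 8                # samples per second
levels = 2 ** 3                # 3-bit quantisation gives 8 discrete levels

# Sampling: measure the continuous signal at discrete instants in time.
samples = [math.sin(2 * math.pi * t / sample_rate) for t in range(sample_rate)]

# Quantisation: map each amplitude in [-1, 1] onto an integer level 0..7.
digital = [min(levels - 1, int((s + 1) / 2 * levels)) for s in samples]
# digital is now [4, 6, 7, 6, 4, 1, 0, 1]
```

Higher sample rates and more bits per sample reproduce the analog original more faithfully, at the cost of more data.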

Data processing[edit]

Main page: Digital signal processing

Once digitized, media may be processed in a variety of ways using standard computer hardware and software or, where performance is critical, in high-performance digital hardware such as an ASIC. Processing can include editing, filtering and content creation.
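
Once samples are digital, processing is ordinary computation. As a minimal illustrative sketch, a three-point moving average, one of the simplest low-pass filters, smooths a noisy sequence of samples in a few lines of software.

```python
def moving_average(samples, width=3):
    # Slide a window over the samples and average each window.
    return [sum(samples[i:i + width]) / width
            for i in range(len(samples) - width + 1)]

noisy = [0, 10, 0, 10, 0, 10]      # rapidly alternating "noise"
smoothed = moving_average(noisy)   # swings of 0..10 shrink to about 3.3..6.7
```

The same filtering could run in an ASIC when performance is critical; the mathematics is identical.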

Post-Network Era[edit]

The post-network era is characterised by the fast pace that technology enables. Home recording of television has advanced the medium, and digital programming can be downloaded instantly. Digital technology allows broadcasters to create numerous channels, and service providers offer on-demand services that give people control over when and where they watch or hear their media. Digital technology has converged television and computers into one single medium.[1]

Art[edit]

Main page: Digital art
Picture produced by Drawing Machine 2

Digital art is any art in which computers play a role in the production or display of the artwork. Such art can be an image, sound, animation, video, CD-ROM, DVD-ROM, videogame, web site, algorithm, performance or gallery installation. Many traditional disciplines are now integrating digital technologies and, as a result, the lines between traditional works of art and new media works created using computers have been blurred. For instance, an artist may combine traditional painting with algorithm art and other digital techniques. Often, the medium itself is considered the artwork, so defining computer art by its end product can be difficult. Nevertheless, this type of art is beginning to appear in art museum exhibits.

Comic book artists in the past would generally sketch a drawing in pencil before going over the drawing again with India ink, using pens and brushes. Magazine illustrators often worked with India ink, acrylics or oils. Currently, an increasing number of artists are now creating digital artwork.

Digital artists do, simply, what centuries of artists have always done: they explore and adopt their culture's new technology toward the making of personal imagery. In doing so, the culture is reflected in the artwork along with the artist's personal vision. As our culture becomes increasingly digitized, digital artists are leading the way in exploring and defining this new culture. Digital artists use a medium that is nearly immaterial: binary information that describes the color and brightness of each individual pixel on a computer screen. Taken as a whole, an image consisting of pure light is the feedback device that tells an artist what is being made and simultaneously stored on the computer's hard drive. Digital artists employ many types of user interfaces that correspond to the wide variety of brushes, lenses and other tools that traditional artists use to shape their materials. Rather than manipulating digital code directly as math, these electronic brushes and tools allow an artist to translate hand motions, cutting and pasting, and what were formerly chemical darkroom techniques into the mathematical changes that affect the arrangement of screen pixels and create a picture.

Digital art is created and stored in a non-material form in the computer's memory systems and must be made physical, usually in the form of prints on paper or some other printmaking substrate. In addition, digital art may be exchanged and appreciated directly on a computer screen in gallery situations, or simultaneously everywhere on the globe with access to the web. Being immaterial has its advantages, and with the advent of high-quality digital printing techniques a very traditional, long-lasting print of the artwork can also be produced and marketed.

References[edit]

  1. Lotz, Amanda D. (2007). The Television Will Be Revolutionized. New York and London, NY: New York University Press. pp. 53–54. ISBN 978-0-8147-5219-7. 

Further reading[edit]

External links[edit]

In computing, a hyperlink is a reference to data that the reader can directly follow, or that is followed automatically.[1] A hyperlink points to a whole document or to a specific element within a document. Hypertext is text with hyperlinks. A software system for viewing and creating hypertext is a hypertext system, and to create a hyperlink is to hyperlink (or simply to link). A user following hyperlinks is said to navigate or browse the hypertext.

A hyperlink has an anchor, which is the location within a document from which the hyperlink can be followed; the document containing a hyperlink is known as its source document. For example, in an online reference work such as Wikipedia, many words and terms in the text are hyperlinked to definitions of those terms. Hyperlinks are often used to implement reference mechanisms, such as tables of contents, footnotes, bibliographies, indexes, letters and glossaries.

In some hypertext, hyperlinks can be bidirectional: they can be followed in two directions, so both ends act as anchors and as targets. More complex arrangements exist, such as many-to-many links.

The effect of following a hyperlink may vary with the hypertext system and may sometimes depend on the link itself; for instance, on the World Wide Web, most hyperlinks cause the target document to replace the document being displayed, but some are marked to cause the target document to open in a new window. Another possibility is transclusion, in which the link target is a document fragment that replaces the link anchor within the source document. Hyperlinks are not only followed by people browsing a document; they may also be followed automatically by programs. A program that traverses the hypertext, following each hyperlink and gathering all the retrieved documents, is known as a Web spider or crawler.
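The traversal a Web spider performs can be sketched in a few lines of Python. This is a minimal illustration, not a real crawler: the three-page "site" is a hypothetical in-memory dictionary, whereas a real spider would fetch each document over the network.

```python
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    """Collects the href targets of all <a> anchors in a page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def crawl(pages, start):
    """Breadth-first traversal of hyperlinks, as a spider would.

    `pages` maps a URL to its HTML source; a real crawler would
    retrieve each document over the network instead.
    """
    seen, queue = {start}, [start]
    while queue:
        url = queue.pop(0)
        parser = LinkExtractor()
        parser.feed(pages.get(url, ""))
        for target in parser.links:
            if target in pages and target not in seen:
                seen.add(target)
                queue.append(target)
    return seen

# Hypothetical three-page site: A links to B, B links to C.
site = {
    "a.html": '<p>See <a href="b.html">page B</a>.</p>',
    "b.html": '<p>See <a href="c.html">page C</a>.</p>',
    "c.html": "<p>No links here.</p>",
}
print(sorted(crawl(site, "a.html")))  # ['a.html', 'b.html', 'c.html']
```

Starting from `a.html`, the spider discovers every page reachable by following hyperlinks.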

Types of links[edit]

Inline link[edit]

An inline link displays remote content without the need for embedding the content. The remote content may be accessed with or without the user selecting the link. For example, an image that exists as a separate document can be included in a page with an inline link.

An inline link may display a modified version of the content; for instance, instead of an image, a thumbnail, low resolution preview, cropped section, or magnified section may be shown. The full content will then usually be available on demand, as is the case with print publishing software – e.g. with an external link. This allows for smaller file sizes and quicker response to changes when the full linked content is not needed, as is the case when rearranging a page layout.

Anchor[edit]

An anchor hyperlink is a link bound to a portion of a document—generally text, though not necessarily. For instance, it may also be a hot area in an image (image map in HTML), a designated, often irregular part of an image. One way to define it is by a list of coordinates that indicate its boundaries. For example, a political map of Africa may have each country hyperlinked to further information about that country. A separate invisible hot area interface allows for swapping skins or labels within the linked hot areas without repetitive embedding of links in the various skin elements.

Hyperlinks in various technologies[edit]

Hyperlinks in HTML[edit]

Tim Berners-Lee saw the possibility of using hyperlinks to link any information to any other information over the Internet. Hyperlinks were therefore integral to the creation of the World Wide Web. Web pages are written in the hypertext mark-up language HTML.

Links are specified in HTML using <a> (anchor) elements. To see the HTML used to create a page, most browsers offer a "view page source" option. In the HTML code, an anchor begins with the opening tag <a href="URL">, which gives the link's target, followed by the highlighted text and the closing tag </a>, which marks the end of the source anchor. The <a> element can also be used to indicate the target of a link.

A webgraph is a graph formed from web pages as vertices and hyperlinks as directed edges.
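To make that definition concrete, a webgraph can be represented as an adjacency list mapping each page to the pages it links to. The pages and edges below are hypothetical, and the helper function is just an illustration of a question one can ask of such a graph.

```python
# A webgraph: pages are vertices, hyperlinks are directed edges.
# The page names and edge lists here are hypothetical.
webgraph = {
    "home": ["about", "news"],
    "about": ["home"],
    "news": ["home", "about"],
}

def in_degree(graph, page):
    """Number of pages linking *to* `page` (its inbound links)."""
    return sum(page in targets for targets in graph.values())

print(in_degree(webgraph, "home"))  # 2 (linked from "about" and "news")
print(in_degree(webgraph, "news"))  # 1 (linked only from "home")
```

Inbound-link counts of this kind are the raw material of link-analysis algorithms such as PageRank.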

XLink: hyperlinks in XML[edit]

Main page: XLink

The W3C Recommendation called XLink describes hyperlinks that offer a far greater degree of functionality than those offered in HTML. These extended links can be multidirectional, linking from, within, and between XML documents. The recommendation also describes simple links, which are unidirectional and therefore offer no more functionality than hyperlinks in HTML.

Hyperlinks in other document technologies[edit]

Hyperlinks are used in the Gopher protocol, text editors, PDF documents, help systems such as Windows Help, word processing documents, spreadsheets, Apple's HyperCard and many other places.

Hyperlinks in virtual worlds[edit]

Main page: Hyperlinks in virtual worlds

Hyperlinks are being implemented in various 3D virtual world networks, including those which utilize the OpenSimulator[2] and Open Cobalt[3] platforms.

Hyperlinks in wikis[edit]

While wikis may use HTML-style hyperlinks, the lightweight markup languages of wikis (wiki markup) provide a simplified syntax, called wikilinks, for linking pages within wiki environments.

The syntax and appearance of wikilinks may vary. Ward Cunningham's original wiki software, the WikiWikiWeb, used CamelCase for this purpose. CamelCase was also used in the early version of Wikipedia and is still used in some wikis, such as JSPWiki, TiddlyWiki, Trac and PMWiki. A common markup uses double square brackets around the term to be wikilinked; for example, the input [[wiki software]] will be converted by the wiki software into a link displaying the text "wiki software".
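The double-bracket conversion can be sketched with a small regular-expression pass. This is a toy version under simplified assumptions (the `/wiki/` URL prefix is hypothetical); real wiki engines also handle piped links, namespaces, and red links for missing pages.

```python
import re

def wikilinks_to_html(text, base="/wiki/"):
    """Convert [[double bracket]] wikilinks to HTML anchors.

    A toy sketch of what wiki software does; the `base` URL prefix
    is a hypothetical convention, not a fixed standard.
    """
    def repl(match):
        title = match.group(1)
        href = base + title.replace(" ", "_")
        return '<a href="%s">%s</a>' % (href, title)
    return re.sub(r"\[\[([^\[\]|]+)\]\]", repl, text)

print(wikilinks_to_html("See [[wiki software]] for details."))
# See <a href="/wiki/wiki_software">wiki software</a> for details.
```

The reader sees only the link label; the markup itself never reaches the rendered page.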

Hyperlinks used in wikis are commonly classified as follows:

  • Internal wikilinks or intrawiki links lead to pages within the same wiki website.
  • Interwiki links are simplified markup hyperlinks that lead to pages of other wikis.
  • External links lead to other webpages.

If an internal wikilink leads to a page that does not exist, it usually has a distinct visual appearance. For example, in Wikipedia such links are commonly displayed in red and are therefore called red links.[4] Another approach is to display a highlighted, clickable question mark next to the wikilinked term.

How hyperlinks work in HTML[edit]

A link from one domain to another is said to be outbound from its source anchor and inbound to its target.

The most common destination anchor is a URL used in the World Wide Web. This can refer to a document, e.g. a webpage, or other resource, or to a position in a webpage. The latter is achieved by means of an HTML element with a "name" or "id" attribute at that position of the HTML document. The URL of the position is the URL of the webpage with a fragment identifier — "#attribute name" — appended.

When linking to a PDF document from an HTML page, the fragment can instead use syntax that references a page number or another element of the PDF, for example "#page=386".
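Fragment identifiers can be manipulated with Python's standard `urllib.parse` module, which splits a URL into its components; a short sketch with hypothetical example URLs:

```python
from urllib.parse import urldefrag, urlparse

# A fragment identifier addresses a position inside a document.
url = "https://example.com/book.html#chapter-3"
base, fragment = urldefrag(url)
print(base)      # https://example.com/book.html
print(fragment)  # chapter-3

# The same mechanism addresses a page inside a linked PDF.
pdf_url = "https://example.com/manual.pdf#page=386"
print(urlparse(pdf_url).fragment)  # page=386
```

The fragment is never sent to the server; the browser (or PDF plugin) interprets it after the document is retrieved.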

Link behavior in web browsers[edit]

A web browser usually displays a hyperlink in some distinguishing way, e.g. in a different color, font or style. The behavior and style of links can be specified using the Cascading Style Sheets (CSS) language. In a graphical user interface, the appearance of a pointer may change into a hand motif to indicate a link. In most graphical web browsers, links are displayed in underlined blue text when they have not been visited, but in underlined purple text when they have. When the user activates the link (e.g. by clicking on it with the mouse) the browser displays the target of the link. If the target is not an HTML file, depending on the file type and on the browser and its plugins, another program may be activated to open the file.

The HTML code contains some or all of the five main characteristics of a link:

  • link destination ("href" pointing to a URL)
  • link label
  • link title
  • link target
  • link class or link id

It uses the HTML element "a" with the attribute "href" (HREF is an abbreviation for "Hypertext REFerence"[5]) and optionally also the attributes "title", "target", and "class" or "id":

<a href="URL" title="link title" target="link target" class="link class">link label</a>

For example, to embed a link into a page, blog post, or comment, it may take this form:

<a href="http://example.com/">Example</a>

After publishing, the complex link string is reduced to its label, "Example", for visualization in typical web browsers. This contributes to a clean, easy-to-read text or document.

When the pointer hovers over a link, depending on the browser and/or graphical user interface, some informative text about the link can be shown, popping up not in a regular window but in a special hover box, which disappears when the pointer is moved away (sometimes it disappears anyway after a few seconds, and reappears when the pointer is moved away and back). Mozilla Firefox, IE, Opera, and many other web browsers all show the URL. In addition, the URL is commonly shown in the status bar.
Normally, a link opens in the current frame or window, but sites that use frames and multiple windows for navigation can add a special "target" attribute to specify where the link will be loaded. If no window exists with that name, a new window will be created with that name, which can be used to refer to the window later in the browsing session.

Creation of new windows is probably the most common use of the "target" attribute. To prevent accidental reuse of a window, the special window names "_blank" and "_new" are usually available, and will always cause a new window to be created. It is especially common to see this type of link when one large website links to an external page. The intention in that case is to ensure that the person browsing is aware that there is no endorsement of the site being linked to by the site that was linked from. However, the attribute is sometimes overused and can cause many windows to be created even while browsing a single site. Another special target name is "_top", which causes any frames in the current window to be cleared away so that browsing can continue in the full window.

History of the hyperlink[edit]

The term "hyperlink" was coined in 1965 (or possibly 1964) by Ted Nelson at the start of Project Xanadu. Nelson had been inspired by "As We May Think", a popular essay by Vannevar Bush. In the essay, Bush described a microfilm-based machine (the Memex) in which one could link any two pages of information into a "trail" of related information, and then scroll back and forth among pages in a trail as if they were on a single microfilm reel.

In a series of books and articles published from 1964 through 1980, Nelson transposed Bush's concept of automated cross-referencing into the computer context, made it applicable to specific text strings rather than whole pages, generalized it from a local desk-sized machine to a theoretical worldwide computer network, and advocated the creation of such a network. Meanwhile, working independently, a team led by Douglas Engelbart (with Jeff Rulifson as chief programmer) was the first to implement the hyperlink concept for scrolling within a single document (1966), and soon after for connecting between paragraphs within separate documents (1968), with NLS.

HyperCard, a database program for the Apple Macintosh released in 1987, allowed hyperlinking between various types of pages within a document.

Legal issues[edit]

While hyperlinking among webpages is an intrinsic feature of the web, some websites object to being linked to from other websites; some have claimed that linking to them is not allowed without permission.

Contentious in particular are deep links, which do not point to a site's home page or other entry point designated by the site owner, but to content elsewhere, allowing the user to bypass the site's own designated flow, and inline links, which incorporate the content in question into the pages of the linking site, making it seem part of the linking site's own content unless an explicit attribution is added.

In certain jurisdictions it is or has been held that hyperlinks are not merely references or citations, but are devices for copying web pages. In the Netherlands, Karin Spaink was initially convicted in this way of copyright infringement by linking, although this ruling was overturned in 2003. The courts that advocate this view see the mere publication of a hyperlink that connects to illegal material to be an illegal act in itself, regardless of whether referencing illegal material is illegal. In 2004, Josephine Ho was acquitted of 'hyperlinks that corrupt traditional values' in Taiwan.[6]

In 2000 British Telecom sued Prodigy, claiming that Prodigy infringed its patent (U.S. Patent 4,873,662) on web hyperlinks. After litigation, a court found for Prodigy, ruling that British Telecom's patent did not cover web hyperlinks.[7]

In United States jurisprudence, there is a distinction between the mere act of linking to someone else's website and linking to content that is illegal or infringing.[8] Several courts have found that merely linking to someone else's website is not copyright or trademark infringement, regardless of how much someone else might object.[9][10][11] Linking to illegal or infringing content, however, can be sufficiently problematic to give rise to legal liability.[12][13][14][15][16]

References[edit]

  1. Merriam-Webster.com, hyperlink
  2. Hypergrid
  3. Creating, Saving, and Loading Spaces
  4. Wikipedia: the missing manual By John Broughton, 2008, ISBN 0596515162, p. 75
  5. Tim Berners-Lee, Making a Server ("HREF" is for "hypertext reference")
  6. The prosecution of Taiwan sexuality researcher and activist Josephine Ho
  7. CNET News.com, Hyperlink patent case fails to click. August 23, 2002.
  8. Cybertelecom:: Legal to Link?[dead link]
  9. Ford Motor Company v. 2600 Enterprises, 177 F.Supp.2d 611 (EDMi December 20, 2001)
  10. American Civil Liberties Union v. Miller, 977 F.Supp. 1228 (ND Ga. 1997)
  11. Ticketmaster Corp. v. Tickets.Com, Inc., No. 99-07654 (CD Calif. March 27, 2000)
  12. Intellectual Reserve v. Utah Lighthouse Ministry, Inc., 75 FSupp2d 1290 (D Utah 1999)
  13. Universal City Studios Inc v Reimerdes, 111 FSupp2d 294 (DCNY 2000)
  14. Comcast of Illinois X LLC v. Hightech Elec. Inc., District Court for the Northern District of Illinois, Decision of July 28, 2004, 03 C 3231
  15. WebTVWire.com, Linking to Infringing Video is probably Illegal in the US. December 10, 2006.
  16. Compare Perfect 10 v. Google, Decision of February 21, 2006, Case No. CV 04-9484 AHM (CD Cal. 2/21/06), CRI 2006, 76–88 No liability for thumbnail links to infringing content

Further reading[edit]

VPN Connectivity overview

A virtual private network (VPN) is a network that uses primarily public telecommunication infrastructure, such as the Internet, to provide remote offices or traveling users access to a central organizational network.

VPNs typically require remote users of the network to be authenticated, and often secure data with encryption technologies to prevent disclosure of private information to unauthorized parties.

VPNs may serve any network functionality that is found on any network, such as sharing of data and access to network resources, printers, databases, websites, etc. A VPN user typically experiences the central network in a manner that is identical to being connected directly to the central network. VPN technology via the public Internet has replaced the need to requisition and maintain expensive dedicated leased-line telecommunication circuits once typical in wide-area network installations.

History[edit]

Virtual Private Network technology reduces costs because it does not need physical leased lines to connect remote users to an intranet.[1]

VPN systems can be classified by:

  • The protocols used to tunnel the traffic
  • The tunnel's termination point, i.e., customer edge or network provider edge
  • Whether they offer site-to-site or remote access connectivity
  • The levels of security provided
  • The OSI layer they present to the connecting network, such as Layer 2 circuits or Layer 3 network connectivity

Some classification schemes are discussed in the following sections.

Security mechanisms[edit]

Secure VPNs use cryptographic tunneling protocols to provide confidentiality by blocking intercepts and packet sniffing, sender authentication to block identity spoofing, and message integrity by preventing message alteration.
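The message-integrity property can be illustrated in isolation with Python's standard `hmac` module. This is only a sketch of the integrity check: a real secure VPN protocol (IPsec, for example) also encrypts the payload and authenticates the tunnel endpoints, and the pre-shared key below is hypothetical.

```python
import hashlib
import hmac

def seal(key, payload):
    """Append an HMAC-SHA256 tag so the receiver can detect alteration.

    Integrity only -- real secure VPN protocols also provide
    confidentiality (encryption) and endpoint authentication.
    """
    tag = hmac.new(key, payload, hashlib.sha256).digest()
    return payload + tag

def open_sealed(key, packet):
    payload, tag = packet[:-32], packet[-32:]  # SHA-256 tag is 32 bytes
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("message altered in transit")
    return payload

key = b"pre-shared-tunnel-key"  # hypothetical pre-shared key
packet = seal(key, b"route update")
assert open_sealed(key, packet) == b"route update"

tampered = packet[:-1] + bytes([packet[-1] ^ 1])  # flip one bit
try:
    open_sealed(key, tampered)
except ValueError as err:
    print(err)  # message altered in transit
```

Any modification of the payload or tag in transit makes the verification fail, which is exactly the alteration-prevention guarantee described above.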

Secure VPN protocols include the following:

Authentication[edit]

Tunnel endpoints must authenticate before secure VPN tunnels can be established.

User-created remote access VPNs may use passwords, biometrics, two-factor authentication or other cryptographic methods.

Network-to-network tunnels often use passwords or digital certificates, as they permanently store the key to allow the tunnel to establish automatically and without intervention from the user.

Routing[edit]

Tunneling protocols can be used in a point-to-point topology that would theoretically not be considered a VPN, because a VPN by definition is expected to support arbitrary and changing sets of network nodes. But since most router implementations support a software-defined tunnel interface, customer-provisioned VPNs often are simply defined tunnels running conventional routing protocols.

PPVPN Building blocks[edit]

Depending on whether the PPVPN (Provider Provisioned VPN) runs in layer 2 or layer 3, the building blocks described below may be L2 only, L3 only, or combine them both. Multiprotocol Label Switching (MPLS) functionality blurs the L2-L3 identity.

RFC 4026 generalized the following terms to cover L2 and L3 VPNs, but they were introduced in RFC 2547.[6]

Customer edge device. (CE)

A device at the customer premises that provides access to the PPVPN. Sometimes it is merely a demarcation point between provider and customer responsibility; some providers allow customers to configure it.

Provider edge device (PE)

A PE is a device, or set of devices, at the edge of the provider network, that presents the provider's view of the customer site. PEs are aware of the VPNs that connect through them, and maintain VPN state.

Provider device (P)

A P device operates inside the provider's core network, and does not directly interface to any customer endpoint. It might, for example, provide routing for many provider-operated tunnels that belong to different customers' PPVPNs. While the P device is a key part of implementing PPVPNs, it is not itself VPN-aware and does not maintain VPN state. Its principal role is allowing the service provider to scale its PPVPN offerings, for example by acting as an aggregation point for multiple PEs. P-to-P connections, in such a role, often are high-capacity optical links between major locations of providers.

User-visible PPVPN services[edit]

This section deals with the types of VPN considered in the IETF; some historical names were replaced by these terms.

OSI Layer 1 services[edit]

Virtual private wire and private line services (VPWS and VPLS)[edit]

In both of these services, the service provider does not offer a full routed or bridged network, but provides components to build customer-administered networks. VPWS are point-to-point while VPLS can be point-to-multipoint. They can be Layer 1 emulated circuits with no data link structure.

The customer determines the overall customer VPN service, which also can involve routing, bridging, or host network elements.

An unfortunate acronym confusion can occur between Virtual Private Line Service and Virtual Private LAN Service; the context should make it clear whether "VPLS" means the layer 1 virtual private line or the layer 2 virtual private LAN.

OSI Layer 2 services[edit]

Virtual LAN

A Layer 2 technique that allows for the coexistence of multiple LAN broadcast domains, interconnected via trunks using the IEEE 802.1Q trunking protocol. Other trunking protocols have been used but have become obsolete, including Inter-Switch Link (ISL), IEEE 802.10 (originally a security protocol but a subset was introduced for trunking), and ATM LAN Emulation (LANE).

Virtual private LAN service (VPLS)

Developed by IEEE, VLANs allow multiple tagged LANs to share common trunking. VLANs frequently comprise only customer-owned facilities. Whereas VPLS as described in the above section (OSI Layer 1 services) supports emulation of both point-to-point and point-to-multipoint topologies, the method discussed here extends Layer 2 technologies such as 802.1d and 802.1q LAN trunking to run over transports such as Metro Ethernet.

As used in this context, a VPLS is a Layer 2 PPVPN, rather than a private line, emulating the full functionality of a traditional local area network (LAN). From a user standpoint, a VPLS makes it possible to interconnect several LAN segments over a packet-switched, or optical, provider core; a core transparent to the user, making the remote LAN segments behave as one single LAN.[7]

In a VPLS, the provider network emulates a learning bridge, which optionally may include VLAN service.

Pseudo wire (PW)

PW is similar to VPWS, but it can provide different L2 protocols at both ends. Typically, its interface is a WAN protocol such as Asynchronous Transfer Mode or Frame Relay. In contrast, when aiming to provide the appearance of a LAN contiguous between two or more locations, the Virtual Private LAN service or IPLS would be appropriate.

IP-only LAN-like service (IPLS)

A subset of VPLS in which the CE devices must have L3 capabilities; the IPLS presents packets rather than frames. It may support IPv4 or IPv6.

OSI Layer 3 PPVPN architectures[edit]

This section discusses the main architectures for PPVPNs, one where the PE disambiguates duplicate addresses in a single routing instance, and the other, virtual router, in which the PE contains a virtual router instance per VPN. The former approach, and its variants, have gained the most attention.

One of the challenges of PPVPNs involves different customers using the same address space, especially the IPv4 private address space.[8] The provider must be able to disambiguate overlapping addresses in the multiple customers' PPVPNs.

BGP/MPLS PPVPN

In the method defined by RFC 2547, BGP extensions advertise routes in the IPv4 VPN address family, which take the form of 12-byte strings beginning with an 8-byte Route Distinguisher (RD) and ending with a 4-byte IPv4 address. RDs disambiguate otherwise duplicate addresses in the same PE.
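The 12-byte layout can be made concrete by packing one such address in Python. The sketch assumes the common type-0 Route Distinguisher layout (a 2-byte type field, a 2-byte AS number and a 4-byte assigned number, as in the successor specification RFC 4364); the AS number and addresses are hypothetical.

```python
import socket
import struct

def vpn_ipv4_address(rd_asn, rd_number, ipv4):
    """Build a 12-byte VPN-IPv4 address: an 8-byte Route Distinguisher
    followed by the 4-byte IPv4 address.

    Uses the type-0 RD layout (2-byte type, 2-byte ASN, 4-byte number)
    for illustration; other RD types arrange the 8 bytes differently.
    """
    rd = struct.pack("!HHI", 0, rd_asn, rd_number)
    return rd + socket.inet_aton(ipv4)

# Hypothetical customer: AS 65000, RD value 1, overlapping address 10.0.0.1.
addr = vpn_ipv4_address(rd_asn=65000, rd_number=1, ipv4="10.0.0.1")
print(len(addr))   # 12
print(addr.hex())  # 0000fde8000000010a000001
```

Two customers can both use 10.0.0.1 internally: because their RDs differ, the resulting 12-byte VPN-IPv4 addresses differ, and the PE can carry both routes without conflict.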

PEs understand the topology of each VPN, which are interconnected with MPLS tunnels, either directly or via P routers. In MPLS terminology, the P routers are Label Switch Routers without awareness of VPNs.

Virtual router PPVPN

The Virtual Router architecture,[9][10] as opposed to BGP/MPLS techniques, requires no modification to existing routing protocols such as BGP. By provisioning logically independent routing domains, the customer operating a VPN is completely responsible for the address space. In the various MPLS tunnels, the different PPVPNs are disambiguated by their labels but do not need route distinguishers.

Virtual router architectures do not need to disambiguate addresses, because rather than a PE router having awareness of all the PPVPNs, the PE contains multiple virtual router instances, which belong to one and only one VPN.

Plaintext tunnels[edit]

Some virtual networks may not use encryption to protect the data contents. While VPNs often provide security, an unencrypted overlay network does not neatly fit within the secure or trusted categorization. For example, a tunnel set up between two hosts using Generic Routing Encapsulation (GRE) would in fact be a virtual private network, but neither secure nor trusted.

Besides the GRE example above, native plaintext tunneling protocols include Layer 2 Tunneling Protocol (L2TP) when it is set up without IPsec, and Point-to-Point Tunneling Protocol (PPTP) when used without Microsoft Point-to-Point Encryption (MPPE).

Trusted delivery networks[edit]

Trusted VPNs do not use cryptographic tunneling, and instead rely on the security of a single provider's network to protect the traffic.

From the security standpoint, VPNs either trust the underlying delivery network, or must enforce security with mechanisms in the VPN itself. Unless the trusted delivery network runs among physically secure sites only, both trusted and secure models need an authentication mechanism for users to gain access to the VPN.

VPNs in mobile environments[edit]

Mobile VPNs are used in a setting where an endpoint of the VPN is not fixed to a single IP address, but instead roams across various networks such as data networks from cellular carriers or between multiple Wi-Fi access points.[14] Mobile VPNs have been widely used in public safety, where they give law enforcement officers access to mission-critical applications, such as computer-assisted dispatch and criminal databases, while they travel between different subnets of a mobile network.[15] They are also used in field service management and by healthcare organizations,[16] among other industries.

Increasingly, mobile VPNs are being adopted by mobile professionals and white-collar workers who need reliable connections.[16] They are used for roaming seamlessly across networks and in and out of wireless-coverage areas without losing application sessions or dropping the secure VPN session. A conventional VPN cannot survive such events because the network tunnel is disrupted, causing applications to disconnect, time out,[14] or fail, or even cause the computing device itself to crash.[16]

Instead of logically tying the endpoint of the network tunnel to the physical IP address, each tunnel is bound to a permanently associated IP address at the device. The mobile VPN software handles the necessary network authentication and maintains the network sessions in a manner transparent to the application and the user.[14] The Host Identity Protocol (HIP), under study by the Internet Engineering Task Force, is designed to support mobility of hosts by separating the role of IP addresses for host identification from their locator functionality in an IP network. With HIP a mobile host maintains its logical connections established via the host identity identifier while associating with different IP addresses when roaming between access networks.

References[edit]

  1. Feilner, Markus. "Chapter 1 - VPN—Virtual Private Network". OpenVPN: Building and Integrating Virtual Private Networks: Learn How to Build Secure VPNs Using this Powerful Open Source Application. Packt Publishing.
  2. Trademark Applications and Registrations Retrieval (TARR)
  3. OpenBSD ssh manual page, VPN section
  4. Unix Toolbox section on SSH VPN
  5. Ubuntu SSH VPN how-to
  6. E. Rosen & Y. Rekhter (March 1999). "RFC 2547 BGP/MPLS VPNs". Internet Engineering Task Forc (IETF). http://www.ietf.org/rfc/rfc2547.txt. 
  7. Ethernet Bridging (OpenVPN), http://openvpn.net/index.php/access-server/howto-openvpn-as/214-how-to-setup-layer-2-ethernet-bridging.html 
  8. Address Allocation for Private Internets, RFC 1918, Y. Rekhter et al.,February 1996
  9. RFC 2917, A Core MPLS IP VPN Architecture
  10. RFC 2918, E. Chen (September 2000)
  11. Layer Two Tunneling Protocol "L2TP", RFC 2661, W. Townsley et al.,August 1999
  12. IP Based Virtual Private Networks, RFC 2341, A. Valencia et al., May 1998
  13. Point-to-Point Tunneling Protocol (PPTP), RFC 2637, K. Hamzeh et al., July 1999
  14. a b c Phifer, Lisa. "Mobile VPN: Closing the Gap", SearchMobileComputing.com, July 16, 2006.
  15. Willett, Andy. "Solving the Computing Challenges of Mobile Officers", www.officer.com, May, 2006.
  16. a b c Cheng, Roger. "Lost Connections", The Wall Street Journal, December 11, 2007.

Further reading[edit]

External links[edit]

Deep Packet Inspection (DPI), also called complete packet inspection and Information eXtraction (IX), is a form of computer network packet filtering that examines the data part (and possibly also the header) of a packet as it passes an inspection point, searching for protocol non-compliance, viruses, spam, intrusions, or predefined criteria to decide whether the packet may pass or needs to be routed to a different destination, or for the purpose of collecting statistical information. There are multiple headers for IP packets; network equipment needs only the first of these (the IP header) for normal operation, but use of the second header (TCP, UDP, etc.) is normally considered shallow packet inspection (usually called stateful packet inspection) despite this definition.[1]

Deep Packet Inspection (and filtering) enables advanced network management, user service, and security functions as well as internet data mining, eavesdropping, and censorship. Although DPI technology has been used for Internet management for many years, some advocates of net neutrality fear that the technology can be used anticompetitively or to reduce the openness of the Internet.[2]

DPI is currently being used by the enterprise, service providers and governments in a wide range of applications.[3]

Background[edit]

DPI combines the functionality of an Intrusion Detection System (IDS) and an Intrusion Prevention System (IPS) with a traditional stateful firewall.[4] This combination makes it possible to detect certain attacks that neither the IDS/IPS nor the stateful firewall can catch on their own. Stateful firewalls, while able to see the beginning and end of a packet flow, cannot on their own catch events that would be out of bounds for a particular application. While IDSs are able to detect intrusions, they have very little capability in blocking such an attack. DPIs are used to prevent attacks from viruses and worms at wire speeds. More specifically, DPI can be effective against buffer overflow attacks, Denial of Service (DoS) attacks, sophisticated intrusions, and a small percentage of worms that fit within a single packet.

DPI-enabled devices have the ability to look at Layer 2 and beyond Layer 3 of the OSI model; in some cases DPI can be invoked to look through Layers 2-7 of the OSI model. This includes headers and data protocol structures as well as the actual payload of the message. DPI functionality is invoked when a device looks at, or takes other action based on, information beyond Layer 3 of the OSI model. DPI can identify and classify traffic based on a signature database that includes information extracted from the data part of a packet, allowing finer control than classification based only on header information. End points can utilize encryption and obfuscation techniques to evade DPI actions in many cases.

A classified packet can be redirected, marked/tagged (see quality of service), blocked, rate limited, and reported to a reporting agent in the network. In this way, HTTP errors of different classifications may be identified and forwarded for analysis. Many DPI devices can identify packet flows (rather than performing packet-by-packet analysis), allowing control actions based on accumulated flow information.
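The contrast between shallow (header-only) and deep (payload) inspection can be sketched in Python with the standard `struct` module. Everything here is simplified for illustration: the signature database is hypothetical, checksums are omitted, and real DPI engines match far richer patterns across reassembled flows.

```python
import struct

# Hypothetical signature database: payload pattern -> classification.
SIGNATURES = {b"X-Worm-Probe": "worm-probe"}

def inspect(packet):
    """Classify a minimal IPv4 packet by looking past its headers.

    Shallow inspection stops at the IP/TCP headers; the deep step
    scans the payload against the signature database.
    """
    ihl = (packet[0] & 0x0F) * 4  # IPv4 header length in bytes
    proto = packet[9]             # IP protocol field (6 = TCP)
    payload = packet[ihl + 20:] if proto == 6 else packet[ihl:]
    for signature, label in SIGNATURES.items():
        if signature in payload:
            return "blocked: " + label
    return "pass"

def make_packet(payload):
    """Assemble a bare-bones IPv4+TCP packet (checksums left as 0)."""
    ip = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 40 + len(payload), 0, 0,
                     64, 6, 0, b"\x0a\x00\x00\x01", b"\x0a\x00\x00\x02")
    tcp = struct.pack("!HHIIHHHH", 12345, 80, 0, 0, 0x5000, 0, 0, 0)
    return ip + tcp + payload

print(inspect(make_packet(b"GET / HTTP/1.1")))       # pass
print(inspect(make_packet(b"X-Worm-Probe: hello")))  # blocked: worm-probe
```

A stateful firewall examining only the first 40 bytes would treat both packets identically; only the payload scan distinguishes them.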

DPI at the enterprise[edit]

Initially security at the enterprise was just a perimeter discipline, with a dominant philosophy of keeping unauthorized users out, and shielding authorized users from the outside world. The most frequently used tool for accomplishing this has been a stateful firewall. It can permit fine-grained control of access from the outside world to pre-defined destinations on the internal network, as well as permitting access back to other hosts only if a request to the outside world has been made previously.[5]

However, vulnerabilities exist at network layers that are not visible to a stateful firewall. Also, an increase in the use of laptops in the enterprise makes it more difficult to prevent threats such as viruses, worms and spyware from penetrating the corporate network, as many users will connect the laptop to less-secure networks such as home broadband connections or wireless networks in public locations. Firewalls also do not distinguish between permitted and forbidden uses of legitimately-accessed applications. DPI enables IT administrators and security officials to set policies and enforce them at all layers, including the application and user layer to help combat those threats.

Deep Packet Inspection is able to detect a few kinds of buffer overflow attacks.

DPI can be used by the enterprise for Data Leak Prevention (DLP). When an e-mail user tries to send a protected file, the user may be given information on how to obtain the proper clearance to send it.[6]

DPI at network/Internet service providers[edit]

In addition to using DPI to secure their internal networks, Internet service providers also apply this technology on the public networks provided to customers. Common uses of DPI by ISPs are lawful intercept, policy definition and enforcement, targeted advertising, quality of service, offering tiered services, and copyright enforcement.

Lawful interception[edit]

Service providers are required by almost all governments worldwide to enable lawful intercept capabilities. Decades ago, in a legacy telephone environment, this was met by creating a traffic access point (TAP) using an intercepting proxy server that connected to the government's surveillance equipment. This approach is not possible in contemporary digital networks. Instead, the acquisition component of this functionality can be provided in many ways, including DPI; DPI-enabled products that are "LI or CALEA-compliant" can be used, when directed by a court order, to access a user's datastream.[7]

Policy definition and enforcement[edit]

Service providers obligated by the service level agreement with their customers to provide a certain level of service, and at the same time enforce an acceptable use policy, may make use of DPI to implement certain policies that cover copyright infringements, illegal materials, and unfair use of bandwidth. In some countries the ISPs are required to perform filtering depending on the country's laws. DPI allows service providers to "readily know the packets of information you are receiving online—from e-mail, to websites, to sharing of music, video and software downloads".[8] Policies can be defined that allow or disallow connection to or from an IP address, certain protocols, or even heuristics that identify a certain application or behavior.

Targeted advertising[edit]

Because ISPs route all of their customers' traffic, they are able to monitor web-browsing habits in great detail, allowing them to gain information about their customers' interests, which can be used by companies specializing in targeted advertising. At least 100,000 US customers are tracked this way, and as many as 10% of US customers have been tracked at some point.[citation needed] Technology providers include NebuAd, Front Porch and Phorm. US ISPs monitoring their customers include Knology[9] and Wide Open West. In addition, the UK ISP British Telecom has admitted testing technology from Phorm without its customers' knowledge or consent.[10]

Quality of service[edit]

Peer-to-peer (P2P) traffic presents increasing problems for broadband service providers. P2P traffic is typically generated by file-sharing applications exchanging documents, music, and videos. Due to the frequently large size of media files being transferred, P2P drives increasing traffic loads, requiring additional network capacity. Service providers say a minority of users generate large quantities of P2P traffic and degrade performance for the majority of broadband subscribers using applications such as email or Web browsing, which use less bandwidth.[11] Poor network performance increases customer dissatisfaction and leads to a decline in service revenues.

DPI allows the operators to oversell their available bandwidth while ensuring equitable bandwidth distribution to all users by preventing network congestion. Additionally, a higher priority can be allocated to a VoIP or video conferencing call which requires low latency versus web browsing which does not.[12] This is the approach that service providers use to dynamically allocate bandwidth according to traffic that is passing through their networks.

Other vendors claim that DPI is ineffective against P2P traffic and that other methods of bandwidth management are more effective.[citation needed]

Tiered services[edit]

Mobile and broadband service providers use DPI as a means to implement tiered service plans, differentiating "walled garden" services from "value added", "all-you-can-eat" and "one-size-fits-all" data services.[13] By being able to charge for a "walled garden", per application, per service, or "all-you-can-eat" rather than a "one-size-fits-all" package, the operator can tailor its offering to the individual subscriber and increase its Average Revenue Per User (ARPU). A policy is created per user or user group, and the DPI system in turn enforces that policy, allowing the user access to different services and applications.

Copyright enforcement[edit]

ISPs are sometimes requested by copyright owners or required by courts or official policy to help enforce copyrights. In 2006, one of Denmark's largest ISPs, Tele2, was given a court injunction and told it must block its customers from accessing The Pirate Bay, a launching point for BitTorrent.[14] Instead of prosecuting file sharers one at a time,[15] the International Federation of the Phonographic Industry (IFPI) and the big four record labels EMI, Sony BMG, Universal Music and Warner Music have begun suing ISPs like Eircom for not doing enough about protecting their copyrights.[16] The IFPI wants ISPs to filter traffic to remove illicitly uploaded and downloaded copyrighted material from their networks, despite European directive 2000/31/EC clearly stating that ISPs may not be put under a general obligation to monitor the information they transmit, and directive 2002/58/EC granting European citizens a right to privacy of communications. The Motion Picture Association of America (MPAA), which enforces movie copyrights, has on the other hand taken the position with the Federal Communications Commission (FCC) that network neutrality could hurt anti-piracy technology such as Deep Packet Inspection and other forms of filtering.[17]

Statistics[edit]

DPI allows ISPs to gather statistical information about usage patterns by user group. For instance, it might be of interest whether users with a 2 Mbit/s connection use the network differently from users with a 5 Mbit/s connection. Access to trend data also helps network planning.

Deep Packet Inspection by governments[edit]

See also: network surveillance and censorship

In addition to using DPI for the security of their own networks, governments in North America, Europe and Asia use DPI for various purposes such as surveillance and censorship; many of these programs are classified.[18]

United States[edit]

FCC adopts Internet CALEA requirements: the FCC, pursuant to its mandate from the US Congress, and in line with the policies of most countries worldwide, has required that all telecommunication providers, including Internet services, be capable of supporting the execution of a court order to provide real-time communication forensics of specified users. In 2006, the FCC adopted new Title 47, Subpart Z, rules requiring that Internet access providers meet these requirements. DPI was one of the platforms essential to meeting this requirement and has been deployed for this purpose throughout the U.S.

Main page: NSA warrantless surveillance controversy

The National Security Agency (NSA), with cooperation from AT&T, has used Deep Packet Inspection technology to make internet traffic surveillance, sorting and forwarding more intelligent. DPI is used to find which packets are carrying e-mail or a Voice over Internet Protocol (VoIP) phone call.[19] Traffic associated with AT&T's Common Backbone was "split" between two fibers, dividing the signal so that 50 percent of the signal strength went to each output fiber. One of the output fibers was diverted to a secure room; the other carried communications on to AT&T's switching equipment. The secure room contained Narus traffic analyzers and logic servers; Narus states that such devices are capable of real-time data collection (recording data for consideration) and capture at 10 gigabits per second. Certain traffic was selected and sent over a dedicated line to a "central location" for analysis. According to Marcus's affidavit, the diverted traffic "represented all, or substantially all, of AT&T's peering traffic in the San Francisco Bay area," and thus, "the designers of the ... configuration made no attempt, in terms of location or position of the fiber split, to exclude data sources comprised [sic] primarily of domestic data."[20] Narus's Semantic Traffic Analyzer software, which runs on IBM or Dell Linux servers using DPI technology, sorts through IP traffic at 10 Gbit/s to pick out specific messages based on a targeted e-mail address, IP address or, in the case of VoIP, phone number.[21] President George W. Bush and Attorney General Alberto R. Gonzales have asserted that they believe the president has the authority to order secret intercepts of telephone and e-mail exchanges between people inside the United States and their contacts abroad without obtaining a FISA warrant.[22]

The Defense Information Systems Agency has developed a sensor platform that uses Deep Packet Inspection.[23]

China[edit]

Main page: Internet censorship in the People's Republic of China

The Chinese government uses Deep Packet Inspection to monitor and censor network traffic and content that it claims is harmful to Chinese citizens or state interests. This material includes pornography, information on religion, and political dissent.[24] Chinese network ISPs use DPI to check whether sensitive keywords are passing through their networks; if so, the connection is cut. People within China often find themselves blocked while accessing Web sites containing content related to Taiwanese and Tibetan independence, Falun Gong, the Dalai Lama, the Tiananmen Square protests and massacre of 1989, political parties that oppose that of the ruling Communist party, or a variety of anti-Communist movements,[25] as those materials have already been flagged as DPI-sensitive keywords. China also blocks VoIP traffic in and out of the country.[citation needed] Voice traffic in Skype is unaffected, although text messages are subject to DPI, and messages containing sensitive material, such as curse-words, are simply not delivered, with no notification provided to either participant in the conversation. China also blocks visual media sites like YouTube.com, and various photography and blogging sites.[26]

Iran[edit]

Main page: Internet censorship in Iran

The Iranian government purchased a system, reportedly for deep packet inspection, in 2008 from Nokia Siemens Networks (NSN), a joint venture of Siemens AG, the German conglomerate, and Nokia Corp., the Finnish cellphone company, according to a report in the Wall Street Journal in June 2009, quoting NSN spokesperson Ben Roome. According to unnamed experts cited in the article, the system "enables authorities to not only block communication but to monitor it to gather information about individuals, as well as alter it for disinformation purposes."

The system was purchased by the Telecommunication Infrastructure Co., part of the Iranian government's telecom monopoly. According to the Journal, NSN "provided equipment to Iran last year under the internationally recognized concept of 'lawful intercept,' said Mr. Roome. That relates to intercepting data for the purposes of combating terrorism, child pornography, drug trafficking and other criminal activities carried out online, a capability that most if not all telecom companies have, he said.... The monitoring center that Nokia Siemens Networks sold to Iran was described in a company brochure as allowing 'the monitoring and interception of all types of voice and data communication on all networks.' The joint venture exited the business that included the monitoring equipment, what it called 'intelligence solutions,' at the end of March, by selling it to Perusa Partners Fund 1 LP, a Munich-based investment firm, Mr. Roome said. He said the company determined it was no longer part of its core business."

The NSN system followed on purchases by Iran from Secure Computing Corp. earlier in the decade.[27]

The reliability of the Journal report has been questioned by David Isenberg, an independent Washington, D.C.-based analyst and Cato Institute Adjunct Scholar, who noted that Mr. Roome has denied the quotes attributed to him and that he, Isenberg, had similar complaints about one of the same Journal reporters in an earlier story.[28] NSN has issued the following denial: NSN "has not provided any deep packet inspection, web censorship or Internet filtering capability to Iran."[29] A concurrent article in The New York Times said the NSN sale had been covered in a "spate of news reports in April [2009], including The Washington Times," and reviewed censorship of the Internet and other media in the country, but did not mention DPI.[30]

DPI and net neutrality[edit]

See also: network neutrality

People and organizations concerned about privacy or network neutrality find inspection of the content layers of the Internet protocol to be offensive,[7] saying for example, "the 'Net was built on open access and non-discrimination of packets!"[31] Critics of network neutrality rules, meanwhile, call them "a solution in search of a problem" and say that net neutrality rules would reduce incentives to upgrade networks and launch next-generation network services.[32]

Software[edit]

OpenDPI[33] is the open source version of the PACE classification engine, covering non-obfuscated protocols; PACE itself additionally handles obfuscated and encrypted protocols such as Skype and encrypted BitTorrent.[34]

L7-Filter is a classifier for Linux's Netfilter that identifies packets based on application-layer data.[35] It can classify traffic from applications and protocols such as Kazaa, HTTP, Jabber, Citrix, BitTorrent, FTP, Gnucleus and eDonkey2000, covering streaming, mail, P2P, VoIP and gaming protocols.

Hippie (Hi-Performance Protocol Identification Engine), developed by Josh Ballard, is an open source project implemented as a kernel module.[36] It supports both DPI and firewall functionality.[37]

The SPID (Statistical Protocol IDentification) project is based on statistical analysis of network flows to identify application traffic.[38] The SPID algorithm can detect the application-layer protocol (Layer 7) by analysing flow statistics (packet sizes, etc.) and payload statistics (byte values, etc.) from pcap files. It is a proof-of-concept application and currently supports around 15 applications/protocols, such as eDonkey obfuscated traffic, Skype UDP and TCP, BitTorrent, IMAP, IRC and MSN.

Tstat (TCP STatistic and Analysis Tool) provides insight into traffic patterns and gives detailed statistics for numerous applications and protocols.[39]

The open source community offers a wide array of options for performing deep packet inspection functions; a comprehensive list is maintained by the dPacket.org community.[40]

See also[edit]

References[edit]

  1. Dr. Thomas Porter (2005-01-11). "The Perils of Deep Packet Inspection". Security Focus. http://www.securityfocus.com/infocus/1817. Retrieved 2008-03-02. 
  2. Hal Abelson, Ken Ledeen, Chris Lewis (2009). "Just Deliver the Packets, in: "Essays on Deep Packet Inspection", Ottawa". Office of the Privacy Commissioner of Canada. http://dpi.priv.gc.ca/index.php/essays/just-deliver-the-packets/. Retrieved 2010-01-08. 
  3. Ralf Bendrath (2009-03-16). "Global technology trends and national regulation: Explaining Variation in the Governance of Deep Packet Inspection, Paper presented at the International Studies Annual Convention, New York City, 15–18 February 2009". International Studies Association. http://userpage.fu-berlin.de/~bendrath/Paper_Ralf-Bendrath_DPI_v1-5.pdf. Retrieved 2010-01-08. 
  4. Ido Dubrawsky (2003-07-29). "Firewall Evolution - Deep Packet Inspection". Security Focus. http://www.securityfocus.com/infocus/1716. Retrieved 2008-03-02. 
  5. Elan Amir (2007-10-29). "The Case for Deep Packet Inspection". IT Business Edge. http://www.itbusinessedge.com/item/?ci=35275. Retrieved 2008-03-02. 
  6. Michael Morisy (2008-10-23). "Data leak prevention starts with trusting your users". SearchNetworking.com. http://searchnetworking.techtarget.com/news/article/0,289142,sid7_gci1335767,00.html. Retrieved 2010-02-01. 
  7. a b Nate Anderson (2007-07-25). "Deep Packet Inspection meets 'Net neutrality, CALEA". ars technica. http://arstechnica.com/articles/culture/Deep-packet-inspection-meets-net-neutrality.ars. Retrieved 2006-02-06. 
  8. Jeff Chester (2006-02-01). "The End of the Internet?". The Nation. http://www.thenation.com/doc/20060213/chester. Retrieved 2006-02-06. 
  9. "Charter Communications: Enhanced Online Experience". http://connect.charter.com/landing/op1.html. Retrieved 2008-05-14. 
  10. Peter Whoriskey (2008-04-04). "Every Click You Make: Internet Providers Quietly Test Expanded Tracking of Web Use to Target Advertising". The Washington Post. http://www.washingtonpost.com/wp-dyn/content/article/2008/04/03/AR2008040304052.html. Retrieved 2008-04-08. 
  11. "Deep Packet Inspection: Taming the P2P Traffic Beast". Light Reading. http://www.lightreading.com/insider/details.asp?sku_id=1221&skuitem_itemid=957. Retrieved 2008-03-03. 
  12. Matt Hamblen (2007-09-17). "Ball State uses Deep Packet Inspection to ensure videoconferencing performance". Computer World. http://www.computerworld.com/action/article.do?command=viewArticleBasic&taxonomyId=16&articleId=9036959&intsrc=hm_topic. Retrieved 2008-03-03. 
  13. "Allot Deploys DPI Solution at Two Tier 1 Mobile Operators to Deliver Value- Added and Tiered Service Packages". Money Central. 2008-02-05. http://news.moneycentral.msn.com/ticker/article.aspx?Feed=PR&Date=20080205&ID=8139811&Symbol=ALLT. Retrieved 2008-03-03. 
  14. Jeremy Kirk (2008-02-13). "Danish ISP prepares to fight Pirate Bay injunction". IDG News Service. http://www.infoworld.com/article/08/02/13/Danish-ISP-prepares-to-fight-Pirate-Bay-injunction_1.html. Retrieved 2008-03-12. 
  15. Matthew Clark (2005-07-05). "Eircom and BT won't oppose music firms". ENN. http://www.enn.ie/frontpage/news-9617239.html. Retrieved 2008-03-12. [dead link]
  16. Eric Bangeman (2008-03-11). ""Year of filters" turning into year of lawsuits against ISPs". ars technica. http://arstechnica.com/news.ars/post/20080311-year-of-filters-turning-into-year-of-lawsuits-against-isps.html. Retrieved 2008-03-12. 
  17. Anne Broach (2007-07-19). "MPAA: Net neutrality could hurt antipiracy tech". CNET News. http://www.news.com/8301-10784_3-9746938-7.html. Retrieved 2008-03-12. 
  18. Carolyn Duffy Marsan (2007-06-27). "OEM provider Bivio targets government market". Network World. http://www.networkworld.com/newsletters/isp/2007/0625isp1.html. Retrieved 2008-03-13. 
  19. J. I. Nelson, Ph.D. (2006-09-26). "How the NSA warrantless wiretap system works". http://www.nerdylorrin.net/jerry/politics/Warrantless/WarrantlessFACTS.html. Retrieved 2008-03-03. 
  20. Bellovin, Steven M.; Matt Blaze, Whitfield Diffie, Susan Landau, Peter G. Neumann, and Jennifer Rexford (January/February 2008). "Risking Communications Security: Potential Hazards of the Protect America Act". IEEE Security and Privacy (IEEE Computer Society) 6 (1): 24–33. doi:10.1109/MSP.2008.17. http://www.crypto.com/papers/paa-ieee.pdf. Retrieved 2008-03-03. 
  21. Robert Poe (2006-05-17). "The Ultimate Net Monitoring Tool". Wired. http://www.wired.com/science/discoveries/news/2006/05/70914. Retrieved 2008-03-03. 
  22. Carol D. Leonnig (2007-01-07). "Report Rebuts Bush on Spying - Domestic Action's Legality Challenged". The Washington Post. http://www.washingtonpost.com/wp-dyn/content/article/2006/01/06/AR2006010601772.html. Retrieved 2008-03-03. 
  23. Cheryl Gerber (2008-09-18). "Deep Security: DISA Beefs Up Security with Deep Packet Inpection of IP Transmissions". https://www.dpacket.org/articles/deep-security-disa-beefs-security-deep-packet-inpection-ip-transmissions. Retrieved 2008-10-30. 
  24. Ben Elgin and Bruce Einhorn (2006-01-12). "The Great Firewall of China". Business Week. http://www.businessweek.com/technology/content/jan2006/tc20060112_434051.htm. Retrieved 2008-03-13. 
  25. "Internet Filtering in China in 2004-2005: A Country Study". Open Net Initiative. http://www.opennetinitiative.net/studies/china/. Retrieved 2008-03-13. 
  26. "China Blocks YouTube, Restores Flickr and Blogspot". PC World. 2007-10-18. http://www.pcworld.com/article/id,138599-c,sites/article.html. Retrieved 2008-03-03. 
  27. "Iran's Web Spying Aided By Western Technology" by Christopher Rhoads in New York and Loretta Chao in Beijing, The Wall Street Journal, June 22, 2009. Retrieved 6/22/09.
  28. "Questions about WSJ story on Net Management in Iran" by David S. Isenberg, isen.blog, June 23, 2009. Retrieved 6/22/09.
  29. "Provision of Lawful Intercept capability in Iran" Company press release. June 22, 2009. Retrieved 6/22/09.
  30. "Web Pries Lid of Iranian Censorship" by Brian Stelter and Brad Stone, The New York Times, June 22, 2009. Retrieved 6/23/09.
  31. Genny Pershing. "Network Neutrality: Historic Neutrality". Cybertelecom. http://www.cybertelecom.org/ci/neutral.htm#his. Retrieved 2008-06-26. 
  32. Genny Pershing. "Network Neutrality: Insufficient Harm". Cybertelecom. http://www.cybertelecom.org/ci/neutral.htm#ins. Retrieved 2008-06-26. 
  33. Opendpi
  34. Deep packet inspection engine goes open source
  35. L7-Filter home page
  36. Hippie Project download page on SourceForge
  37. Hippie reference page
  38. SPID project on SourceForge
  39. Tstat project home
  40. "Open Source deep packet inspection projects". dPacket.org. https://www.dpacket.org/group-posts/open-source-software-general-discussion/open-source-software-related-deep-packet-inspect. 

External links[edit]



An Internet Protocol address (IP address) is a numeric label assigned to each device (e.g., computer, smartphone, hotspot) participating in a computer network that uses the Internet Protocol for communication.[1] The address serves two principal functions: host or network interface identification and location addressing. Its role has been characterized as follows: "A name indicates what we seek. An address indicates where it is. A route indicates how to get there."[1]

The designers of the Internet Protocol defined an IP address as an integer expressed in binary notation with a length of 32 bits,[1] and this system, known as IPv4, is still in use today. However, due to the enormous growth of the Internet and the predicted exhaustion of available IPv4 addresses, a new addressing system called IPv6, using addresses that are binary integers 128 bits long,[2] has been deployed worldwide alongside IPv4 since the mid-2000s.

Though IP addresses are technically binary numbers, they are usually stored and displayed in more human-readable notations like decimal, such as 172.16.254.1 (for IPv4), and 2001:db8:0:1234:0:567:8:1 (for IPv6).

The Internet Assigned Numbers Authority (IANA) manages the IP address space allocations globally and maintains five regional Internet registries (RIRs) to allocate IP address blocks to Internet service providers and other entities.

IP versions[edit]

Two versions of the Internet Protocol (IP) are in use: IP Version 4 and IP Version 6. Each version defines an IP address differently. Because of its prevalence, the generic term IP address typically still refers to the addresses defined by IPv4. The gap in version sequence between IPv4 and IPv6 resulted from the assignment of number 5 to the experimental Internet Stream Protocol in 1979, though it was never referred to as IPv5.

IPv4 addresses[edit]

Conversion of an IPv4 address from "dot-decimal" notation to its binary value

In IPv4 an address consists of 32 bits, which limits the address space to 4,294,967,296 (2^32) possible unique addresses. IPv4 reserves some addresses for special purposes such as private networks (~18 million addresses) or multicast addresses (~270 million addresses).

IPv4 addresses are canonically represented in "dot-decimal" notation which consists of four decimal numbers, each ranging from 0 to 255, separated by full stops or periods, e.g., 172.16.254.1. Each part represents a group of 8 bits (octet) of the address. In some cases of technical writing, IPv4 addresses may be presented in various hexadecimal, octal, or binary representations.
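The relationship between the dot-decimal and binary forms can be checked with Python's standard `ipaddress` module, here for the example address above:

```python
# Converting an IPv4 address between dot-decimal notation, a 32-bit integer,
# and binary, using only the standard library.
import ipaddress

addr = ipaddress.IPv4Address("172.16.254.1")
as_int = int(addr)                 # the address as one 32-bit integer
as_bin = format(as_int, "032b")    # the same value as a 32-bit binary string

print(as_int)  # 2886794753
print(as_bin)  # 10101100000100001111111000000001
```

The integer is simply the four octets weighted by powers of 256: 172×2^24 + 16×2^16 + 254×2^8 + 1 = 2,886,794,753.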

IPv4 subnetting[edit]

In the early stages of development of the Internet Protocol,[3] network administrators interpreted an IP address in two parts: network number portion and host number portion. The highest order octet (most significant eight bits) in an address was designated as the network number and the remaining bits were called the rest field or host identifier and were used for host numbering within a network.

This early method soon proved inadequate as additional networks developed that were independent of the existing networks already designated by a network number. In 1981, the Internet addressing specification was revised with the introduction of classful network architecture.[1]

Classful network design allowed for a larger number of individual network assignments and fine-grained subnetwork design. The leading bits of the most significant octet of an IP address were defined as the class of the address. Three classes (A, B, and C) were defined for universal unicast addressing. Depending on the class derived, the network identification was based on octet boundary segments of the entire address. Each class used successively more octets for the network identifier, thus reducing the possible number of hosts in the higher-order classes (B and C). The following table gives an overview of this, now obsolete, system.

Classful network architecture
Class Leading bits in address (binary) Range of first octet (decimal) Network ID format Host ID format No. of networks No. of addresses per network
A 0 0–127 a b.c.d 2^7 = 128 2^24 = 16,777,216
B 10 128–191 a.b c.d 2^14 = 16,384 2^16 = 65,536
C 110 192–223 a.b.c d 2^21 = 2,097,152 2^8 = 256
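The class rules in the table reduce to simple ranges of the first octet, since the leading-bit patterns 0, 10, and 110 partition the values 0–255. A minimal sketch:

```python
# Determining the (now obsolete) address class and the network/host split
# from the first octet of a dotted IPv4 address.
def classful_info(ip: str):
    first = int(ip.split(".")[0])
    if first < 128:          # leading bit 0
        return ("A", 1)      # class, octets in the network ID
    elif first < 192:        # leading bits 10
        return ("B", 2)
    elif first < 224:        # leading bits 110
        return ("C", 3)
    return ("D/E", None)     # multicast / reserved, no unicast split

print(classful_info("10.1.2.3"))     # ('A', 1)
print(classful_info("172.16.0.1"))   # ('B', 2)
print(classful_info("192.0.2.1"))    # ('C', 3)
```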

Classful network design served its purpose in the startup stage of the Internet, but it lacked scalability in the face of the rapid expansion of the network in the 1990s. The class system of the address space was replaced with Classless Inter-Domain Routing (CIDR) in 1993, which employs variable-length subnet masking (VLSM) to allow allocation and routing based on arbitrary-length prefixes.

Today, remnants of classful network concepts function only in a limited scope as the default configuration parameters of some network software and hardware components (e.g. netmask), and in the technical jargon used in network administrators' discussions.

IPv4 private addresses[edit]

Early network design, when global end-to-end connectivity was envisioned for communications with all Internet hosts, intended that IP addresses be uniquely assigned to a particular computer or device. However, it was found that this was not always necessary as private networks developed and public address space needed to be conserved.

Computers not connected to the Internet, such as factory machines that communicate only with each other via TCP/IP, need not have globally unique IP addresses. Three ranges of IPv4 addresses for private networks were reserved in RFC 1918. These addresses are not routed on the Internet and thus their use need not be coordinated with an IP address registry.

Today, when needed, such private networks typically connect to the Internet through network address translation (NAT).

IANA-reserved private IPv4 network ranges
Start End No. of addresses
24-bit block (/8 prefix, 1 × A) 10.0.0.0 10.255.255.255 16,777,216
20-bit block (/12 prefix, 16 × B) 172.16.0.0 172.31.255.255 1,048,576
16-bit block (/16 prefix, 256 × C) 192.168.0.0 192.168.255.255 65,536
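Membership in the three reserved blocks can be tested with the standard `ipaddress` module; the explicit RFC 1918 networks are listed here rather than relying on the broader built-in `is_private` flag, which also covers loopback and link-local space:

```python
# Checking whether an address falls in one of the RFC 1918 private blocks.
import ipaddress

RFC1918 = [ipaddress.ip_network(n)
           for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")]

def is_rfc1918(ip: str) -> bool:
    a = ipaddress.ip_address(ip)
    return any(a in net for net in RFC1918)

print(is_rfc1918("192.168.1.10"))  # True
print(is_rfc1918("8.8.8.8"))       # False
```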

Any user may use any of the reserved blocks. Typically, a network administrator will divide a block into subnets; for example, many home routers automatically use a default address range of 192.168.1.1 through 192.168.1.255 (192.168.1.0/24 in CIDR notation).

IPv4 address exhaustion[edit]

The IANA pool of unallocated IPv4 addresses available for assignment to end users and Internet service providers was completely exhausted on February 3, 2011, when the last five /8 blocks were allocated to the five RIRs.[4] Asia-Pacific Network Information Centre (APNIC) was the first RIR to exhaust its regional pool, on April 15, 2011, except for a small amount of address space reserved for the transition to IPv6.[5]

IPv6 addresses[edit]

Conversion of an IPv6 address from hexadecimal representation to its binary value

The rapid exhaustion of IPv4 address space, despite conservation techniques, prompted the Internet Engineering Task Force (IETF) to explore new technologies to expand the Internet's addressing capability. The permanent solution was deemed to be a redesign of the Internet Protocol itself. This next generation of the Internet Protocol, intended to complement and eventually replace IPv4, was ultimately named Internet Protocol Version 6 (IPv6) in 1995.[2] The address size was increased from 32 to 128 bits, or 16 octets. This, even with a generous assignment of network blocks, is deemed sufficient for the foreseeable future. Mathematically, the new address space provides the potential for a maximum of approximately 3.403×10^38 unique addresses (2^128).

The new design is not intended to provide a sufficient quantity of addresses on its own, but rather to allow efficient aggregation of subnet routing prefixes at routing nodes. As a result, routing table sizes are smaller, and the smallest possible individual allocation is a subnet for 2^64 hosts, which is the square of the size of the entire IPv4 Internet. At these levels, actual address utilization rates will be small on any IPv6 network segment. The new design also provides the opportunity to separate the addressing infrastructure of a network segment (that is, the local administration of the segment's available space) from the addressing prefix used to route external traffic for a network. IPv6 has facilities that automatically change the routing prefix of entire networks, should the global connectivity or the routing policy change, without requiring internal redesign or renumbering.

The large number of IPv6 addresses allows large blocks to be assigned for specific purposes and, where appropriate, to be aggregated for efficient routing. With a large address space, there is not the need to have complex address conservation methods as used in Classless Inter-Domain Routing (CIDR).

Many modern desktop and enterprise server operating systems include native support for the IPv6 protocol, but it is not yet widely deployed in other devices, such as home networking routers, voice over IP (VoIP) equipment, and other network peripherals.

IPv6 private addresses[edit]

Just as IPv4 reserves addresses for private or internal networks, blocks of addresses are set aside in IPv6 for private addresses. In IPv6, these are referred to as unique local addresses (ULAs). RFC 4193 sets aside the routing prefix fc00::/7 for this block, which is divided into two /8 blocks with different implied policies. The addresses include a 40-bit pseudorandom number that minimizes the risk of address collisions if sites merge or packets are misrouted.[6]
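The structure of a unique local prefix can be sketched as follows: the fd00::/8 block plus the 40-bit pseudorandom Global ID yields a /48 site prefix. Note that RFC 4193 specifies a SHA-1-based algorithm for generating the Global ID; `os.urandom` is used here only as a simplified stand-in.

```python
# A sketch of building an RFC 4193 unique-local /48 prefix. The Global ID
# generation is simplified to plain random bytes for illustration.
import os
import ipaddress

def random_ula_prefix() -> ipaddress.IPv6Network:
    global_id = int.from_bytes(os.urandom(5), "big")   # 40 pseudorandom bits
    prefix_int = (0xFD << 120) | (global_id << 80)     # fd00::/8 + Global ID
    return ipaddress.IPv6Network((prefix_int, 48))

p = random_ula_prefix()
print(p)   # e.g. fd12:3456:789a::/48
print(p.subnet_of(ipaddress.IPv6Network("fc00::/7")))  # True
```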

Addresses starting with fe80:, called link-local addresses, are assigned to interfaces for communication within the subnet only. The addresses are automatically generated by the operating system for each network interface. This provides instant and automatic network connectivity for any IPv6 host and means that if several hosts connect to a common hub or switch, they have a communication path via their link-local IPv6 address. This feature is used in the lower layers of IPv6 network administration (e.g. Neighbor Discovery Protocol).

None of the private address prefixes may be routed on the public Internet.

IP subnetworks[edit]

IP networks may be divided into subnetworks in both IPv4 and IPv6. For this purpose, an IP address is logically recognized as consisting of two parts: the network prefix and the host identifier (renamed the interface identifier in IPv6). The subnet mask or the CIDR prefix determines how the IP address is divided into network and host parts.

The term subnet mask is only used within IPv4. Both IP versions, however, use the Classless Inter-Domain Routing (CIDR) concept and notation. In this notation, the IP address is followed by a slash and the number (in decimal) of bits used for the network part, also called the routing prefix. For example, an IPv4 address and its subnet mask may be 192.0.2.1 and 255.255.255.0, respectively; the CIDR notation for the same IP address and subnet is 192.0.2.1/24, because the first 24 bits of the IP address (192.0.2, precisely) indicate the network and subnet.
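The equivalence between a subnet mask and a CIDR prefix length can be checked with Python's standard ipaddress module; a minimal sketch using the example address from this section:

```python
import ipaddress

# The interface 192.0.2.1 with mask 255.255.255.0 is written 192.0.2.1/24.
iface = ipaddress.ip_interface("192.0.2.1/24")

print(iface.ip)               # 192.0.2.1     (the host's own address)
print(iface.network)          # 192.0.2.0/24  (the network prefix)
print(iface.network.netmask)  # 255.255.255.0 (the equivalent subnet mask)
```

The /24 prefix and the dotted mask 255.255.255.0 carry the same information: the first 24 bits select the network, the remaining 8 identify the host.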

IP address assignment[edit]

Internet Protocol addresses are assigned to a host either anew at the time of booting, or permanently by fixed configuration of its hardware or software. Persistent configuration is known as using a static IP address. In contrast, when a computer's IP address is assigned anew each time it connects, it is said to use a dynamic IP address.

Methods[edit]

Static IP addresses are manually assigned to a computer by an administrator. The exact procedure varies according to platform. This contrasts with dynamic IP addresses, which are assigned either by the computer interface or host software itself, as in Zeroconf, or assigned by a server using Dynamic Host Configuration Protocol (DHCP). Even though IP addresses assigned using DHCP may stay the same for long periods of time, they can generally change. In some cases, a network administrator may implement dynamically assigned static IP addresses. In this case, a DHCP server is used, but it is specifically configured to always assign the same IP address to a particular computer. This allows static IP addresses to be configured centrally, without having to specifically configure each computer on the network in a manual procedure.

In the absence or failure of static or stateful (DHCP) address configurations, an operating system may assign an IP address to a network interface using stateless auto-configuration methods, such as Zeroconf.

Uses of dynamic addressing[edit]

Dynamic IP addresses are most frequently assigned on LANs and broadband networks by DHCP servers. They are used because they avoid the administrative burden of assigning specific static addresses to each device on a network. They also allow many devices to share limited address space on a network if only some of them will be online at a particular time. In most current desktop operating systems, dynamic IP configuration is enabled by default so that a user does not need to manually enter any settings to connect to a network with a DHCP server. DHCP is not the only technology used to assign dynamic IP addresses. Dial-up and some broadband networks use dynamic address features of the Point-to-Point Protocol (PPP).

Sticky dynamic IP address[edit]

A sticky dynamic IP address is an informal term used by cable and DSL Internet access subscribers to describe a dynamically assigned IP address which seldom changes. The addresses are usually assigned with DHCP. Since the modems are usually powered on for extended periods of time, the address leases are usually set to long periods and simply renewed. If a modem is turned off and powered up again before the next expiration of the address lease, it will most likely receive the same IP address.

Address auto-configuration[edit]

RFC 3330 defines an address block, 169.254.0.0/16, for special use in link-local addressing for IPv4 networks. In IPv6 every interface, whether using static or dynamic address assignments, also receives a link-local address automatically, in the block fe80::/10.

These addresses are only valid on the link, such as a local network segment or point-to-point connection, that a host is connected to. These addresses are not routable and like private addresses cannot be the source or destination of packets traversing the Internet.
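One traditional way an operating system generates an IPv6 link-local address is the modified EUI-64 method: the interface's 48-bit MAC address is split in half, ff:fe is inserted in the middle, and the universal/local bit of the first octet is flipped. A sketch of that derivation (the function name and example MAC are ours; note that many modern systems instead use randomized or stable privacy identifiers):

```python
def mac_to_link_local(mac: str) -> str:
    """Derive an IPv6 link-local address from a MAC via modified EUI-64."""
    octets = [int(b, 16) for b in mac.split(":")]
    octets[0] ^= 0x02                          # flip the universal/local bit
    eui64 = octets[:3] + [0xFF, 0xFE] + octets[3:]   # insert ff:fe mid-MAC
    # Pack the eight octets into four 16-bit groups and prepend fe80::.
    groups = [f"{(eui64[i] << 8) | eui64[i + 1]:x}" for i in range(0, 8, 2)]
    return "fe80::" + ":".join(groups)

print(mac_to_link_local("00:1a:2b:3c:4d:5e"))  # fe80::21a:2bff:fe3c:4d5e
```

Because the MAC is unique on the link, each host gets a distinct fe80:: address without any server or manual configuration, which is what makes the instant connectivity described above possible.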

When the link-local IPv4 address block was reserved, no standards existed for mechanisms of address autoconfiguration. Filling the void, Microsoft created an implementation that is called Automatic Private IP Addressing (APIPA). Due to Microsoft's market power, APIPA has been deployed on millions of machines and has become a de facto standard in the industry. Many years later, the IETF defined a formal standard for this functionality, RFC 3927, entitled Dynamic Configuration of IPv4 Link-Local Addresses.

Uses of static addressing[edit]

Some infrastructure situations require static addressing, such as locating the Domain Name System (DNS) servers that translate domain names to IP addresses. Static addresses are also convenient, but not absolutely necessary, for locating servers inside an enterprise. An address obtained from a DNS server comes with a time to live, or caching time, after which it should be looked up again to confirm that it has not changed. Even static IP addresses do change, often as a result of network administration (RFC 2072).

Public addresses[edit]

A public IP address, in common parlance, is synonymous with a globally routable unicast IP address. Both IPv4 and IPv6 define address ranges that are reserved for private networks and link-local addressing. The term public IP address is often used to exclude these types of addresses.

Modifications to IP addressing[edit]

IP blocking and firewalls[edit]

Firewalls protect networks from unauthorized access and are common at every level of today's Internet, controlling access to networks based on the IP address of the computer requesting access. Whether using a blacklist or a whitelist, the IP address that is blocked is the perceived IP address of the client, meaning that if the client is using a proxy server or network address translation, blocking one IP address may block many individual computers.
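The address-based blocking described here amounts to checking the perceived client address against a list of blocked ranges. A minimal blacklist sketch using the standard ipaddress module (the blocked networks shown are documentation-range examples, not a real blocklist):

```python
import ipaddress

# Hypothetical blacklist of client networks (example ranges only).
BLOCKED_NETWORKS = [
    ipaddress.ip_network("198.51.100.0/24"),
    ipaddress.ip_network("203.0.113.8/32"),
]

def is_blocked(client_ip: str) -> bool:
    """Return True if the perceived client address falls in a blocked range."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in BLOCKED_NETWORKS)

print(is_blocked("198.51.100.77"))  # True  (inside the blocked /24)
print(is_blocked("192.0.2.10"))     # False
```

Blocking the /24 here illustrates the caveat in the text: every host whose traffic appears to come from that range, including many machines behind one NAT or proxy, is rejected together.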

IP address translation[edit]

Multiple client devices can appear to share IP addresses: either because they are located on a shared-hosting web server or because an IPv4 network address translator (NAT) or proxy server acts as an intermediary agent on behalf of its customers, in which case the real originating IP addresses might be hidden from the server receiving a request. A common practice is to have a NAT hide a large number of IP addresses in a private network. Only the "outside" interface(s) of the NAT need to have Internet-routable addresses.[7]

Most commonly, the NAT device maps TCP or UDP port numbers on the outside to individual private addresses on the inside. Just as a telephone number may have site-specific extensions, the port numbers are site-specific extensions to an IP address.
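The port-mapping behavior described above can be sketched as a translation table: each outbound connection from a private host is assigned a public-side port, and replies arriving on that port are mapped back inside. This is a toy model with invented names and addresses, not a real NAT implementation:

```python
# Sketch of a port-translating NAT table. All addresses and port
# numbers are illustrative only.
nat_table = {}      # public_port -> (private_ip, private_port)
next_port = 40000   # next free public-side port to hand out

def outbound(private_ip: str, private_port: int) -> int:
    """Allocate a public-side port for an inside host's connection."""
    global next_port
    public_port = next_port
    next_port += 1
    nat_table[public_port] = (private_ip, private_port)
    return public_port

def inbound(public_port: int):
    """Map a reply arriving on a public port back to the inside host."""
    return nat_table.get(public_port)

p = outbound("192.168.1.10", 51515)
print(inbound(p))   # ('192.168.1.10', 51515)
```

The public port plays the role of the telephone extension in the analogy: one outside number (IP address) fans out to many inside endpoints.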

In small home networks, NAT functions usually take place in a residential gateway device, typically one marketed as a "router". In this scenario, the computers connected to the router would have 'private' IP addresses and the router would have a 'public' address to communicate with the Internet. This type of router allows several computers to share one public IP address.

References[edit]

  1. a b c d Information Sciences Institute, University of Southern California (September 1981). "RFC 791 - Internet Protocol". Internet Engineering Task Force. Archived from the original on September 10, 2015. https://web.archive.org/web/20150910142824if_/http://datatracker.ietf.org/doc/rfc791/. Retrieved November 1, 2019. 
  2. a b Deering, S.; Hinden, R. (July 2017). "RFC 8200 - Internet Protocol, Version 6 (IPv6) Specification". Internet Engineering Task Force. Archived from the original on November 21, 2018. https://web.archive.org/web/20181121071704if_/https://datatracker.ietf.org/doc/rfc8200/. Retrieved November 1, 2019. 
  3. Postel, Jon (January 1980). "RFC 760 - DoD standard Internet Protocol". Internet Engineering Task Force. https://datatracker.ietf.org/doc/rfc760/. Retrieved November 1, 2019. 
  4. Smith, Lucie; Lipner, Ian (February 3, 2011). "Free Pool of IPv4 Address Space Depleted". Number Resource Organization. Archived from the original on November 6, 2018. https://web.archive.org/web/20181106153513if_/https://www.nro.net/ipv4-free-pool-depleted. Retrieved February 3, 2011. 
  5. "APNIC IPv4 Address Pool Reaches Final /8". Asia-Pacific Network Information Centre. April 15, 2011. Archived from the original on March 29, 2016. https://web.archive.org/web/20160329084928if_/https://www.apnic.net/publications/news/2011/final-8. Retrieved November 1, 2019. 
  6. Hinden, R.; Haberman, B. (October 2005). "RFC 4193 - Unique Local IPv6 Unicast Addresses". Internet Engineering Task Force. Archived from the original on September 12, 2018. https://web.archive.org/web/20180912214844if_/https://datatracker.ietf.org/doc/rfc4193/. Retrieved November 1, 2019. 
  7. Comer, Douglas (2000). Internetworking with TCP/IP:Principles, Protocols, and Architectures (4th ed.). Upper Saddle River, New Jersey: Prentice Hall. p. 394. ISBN 0130183806. http://www.cs.purdue.edu/homes/dec/netbooks.html. 

External links[edit]



A web browser is a software application for retrieving, presenting, and traversing information resources on the World Wide Web. An information resource is identified by a Uniform Resource Identifier (URI) and may be a web page, image, video, or other piece of content.[1] Hyperlinks present in resources enable users easily to navigate their browsers to related resources. A web browser can also be defined as an application software or program designed to enable users to access, retrieve and view documents and other resources on the Internet.

Although browsers are primarily intended to access the World Wide Web, they can also be used to access information provided by web servers in private networks or files in file systems. The major web browsers are Firefox, Google Chrome, Internet Explorer, Opera, and Safari.[2]

History[edit]

Main page: History of the web browser
WorldWideWeb for NeXT, released in 1991, was the first web browser.[3]

The first web browser was invented in 1990 by Tim Berners-Lee. It was called WorldWideWeb (no spaces) and was later renamed Nexus.[4]

In 1993, browser software was further innovated by Marc Andreessen with the release of Mosaic (later Netscape), "the world's first popular browser",[5] which made the World Wide Web system easy to use and more accessible to the average person. Andreessen's browser sparked the internet boom of the 1990s.[5] These are the two major milestones in the history of the Web.

The introduction of the NCSA Mosaic web browser in 1993 – one of the first graphical web browsers – led to an explosion in web use. Marc Andreessen, the leader of the Mosaic team at NCSA, soon started his own company, named Netscape, and released the Mosaic-influenced Netscape Navigator in 1994, which quickly became the world's most popular browser, accounting for 90% of all web use at its peak (see usage share of web browsers).

Microsoft responded with its Internet Explorer in 1995 (also heavily influenced by Mosaic), initiating the industry's first browser war. Bundled with Windows, Internet Explorer gained dominance in the web browser market; Internet Explorer usage share peaked at over 95% by 2002.[6]

Opera debuted in 1996. It has never achieved widespread use on the desktop, with less than 1% browser usage share as of February 2009 according to Net Applications,[7] growing to 2.14% by April 2011. Its Opera Mini version has an additional share, amounting to 1.11% of overall browser use in April 2011; it is focused on the fast-growing mobile phone web browser market and is preinstalled on over 40 million phones. Opera is also available on several other embedded systems, including Nintendo's Wii video game console.

In 1998, Netscape launched what was to become the Mozilla Foundation in an attempt to produce a competitive browser using the open source software model. That browser would eventually evolve into Firefox, which developed a respectable following while still in the beta stage of development; shortly after the release of Firefox 1.0 in late 2004, Firefox (all versions) accounted for 7.4% of browser use.[6] As of August 2011, Firefox has a 27.7% usage share.[7]

Apple's Safari had its first beta release in January 2003; as of April 2011, it has a dominant share of Apple-based web browsing and accounts for just over 7.15% of the entire browser market.[7]

The most recent major entrant to the browser market is Google's Chrome, first released in September 2008. Chrome's take-up has increased significantly year on year, doubling its usage share from 7.7 percent to 15.5 percent by August 2011. This increase seems largely to be at the expense of Internet Explorer, whose share has tended to decrease from month to month.[8] In December 2011, Google Chrome overtook Internet Explorer 8 as the most widely used web browser. However, when all versions of Internet Explorer are taken together, IE remains the most popular.[9]

Function[edit]

The primary purpose of a web browser is to bring information resources to the user. This process begins when the user inputs a Uniform Resource Locator (URL), for example http://en.wikipedia.org/, into the browser. The prefix of the URL, called the scheme, determines how the URL will be interpreted. The most commonly used kind of URI starts with http: and identifies a resource to be retrieved over the Hypertext Transfer Protocol (HTTP). Many browsers also support a variety of other prefixes, such as https: for HTTPS, ftp: for the File Transfer Protocol, and file: for local files. Prefixes that the web browser cannot directly handle are often handed off to another application entirely. For example, mailto: URIs are usually passed to the user's default e-mail application, and news: URIs are passed to the user's default newsgroup reader.
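This scheme-based dispatch can be sketched with Python's standard urllib.parse module. The handler table below is hypothetical, simply mirroring the hand-off behavior described above:

```python
from urllib.parse import urlsplit

# Hypothetical dispatch table mapping URI schemes to actions, mirroring
# how a browser hands off prefixes it cannot display itself.
HANDLERS = {
    "http": "fetch over HTTP",
    "https": "fetch over HTTP with TLS",
    "ftp": "fetch over FTP",
    "file": "read from the local file system",
    "mailto": "hand off to the default e-mail application",
}

def describe(url: str) -> str:
    """Pick an action based on the URL's scheme (its prefix)."""
    scheme = urlsplit(url).scheme
    return HANDLERS.get(scheme, "hand off to an external application")

print(describe("http://en.wikipedia.org/"))   # fetch over HTTP
print(describe("mailto:user@example.org"))    # hand off to the default e-mail application
```

An unknown scheme such as news: falls through to the external-application default, just as the text describes.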

In the case of http, https, file, and others, once the resource has been retrieved the web browser will display it. HTML is passed to the browser's layout engine to be transformed from markup to an interactive document. Aside from HTML, web browsers can generally display any kind of content that can be part of a web page. Most browsers can display images, audio, video, and XML files, and often have plug-ins to support Flash applications and Java applets. Upon encountering a file of an unsupported type or a file that is set up to be downloaded rather than displayed, the browser prompts the user to save the file to disk.

Information resources may contain hyperlinks to other information resources. Each link contains the URI of a resource to go to. When a link is clicked, the browser navigates to the resource indicated by the link's target URI, and the process of bringing content to the user begins again.

Features[edit]

For more details on this topic, see Comparison of web browsers.

Available web browsers range in features from minimal, text-based user interfaces with bare-bones support for HTML to rich user interfaces supporting a wide variety of file formats and protocols. Browsers which include additional components to support e-mail, Usenet news, and Internet Relay Chat (IRC), are sometimes referred to as "Internet suites" rather than merely "web browsers".[10][11][12]

All major web browsers allow the user to open multiple information resources at the same time, either in different browser windows or in different tabs of the same window. Major browsers also include pop-up blockers to prevent unwanted windows from "popping up" without the user's consent.[13][14][15][16]

Most web browsers can display a list of web pages that the user has bookmarked so that the user can quickly return to them. Bookmarks are also called "Favorites" in Internet Explorer. In addition, all major web browsers have some form of built-in web feed aggregator. In Firefox, web feeds are formatted as "live bookmarks" and behave like a folder of bookmarks corresponding to recent entries in the feed.[17] In Opera, a more traditional feed reader is included which stores and displays the contents of the feed.[18]

Furthermore, most browsers can be extended via plug-ins, downloadable components that provide additional features.

User interface[edit]

Most major web browsers have these user interface elements in common:[19]

  • Back and forward buttons to return to the previous resource or advance to the next one, respectively.
  • A refresh or reload button to reload the current resource.
  • A stop button to cancel loading the resource. In some browsers, the stop button is merged with the reload button.
  • A home button to return to the user's home page.
  • An address bar to input the Uniform Resource Identifier (URI) of the desired resource and display it.
  • A search bar to input terms into a search engine. In some browsers, the search bar is merged with the address bar.
  • A status bar to display progress in loading the resource and the URI of links when the cursor hovers over them; many browsers also provide page zooming here.

Major browsers also possess incremental find features to search within a web page.

Privacy and security[edit]

Most browsers support HTTP Secure and offer quick and easy ways to delete the web cache, cookies, and browsing history. For a comparison of the current security vulnerabilities of browsers, see comparison of web browsers.

Standards support[edit]

Early web browsers supported only a very simple version of HTML. The rapid development of proprietary web browsers led to the development of non-standard dialects of HTML, leading to problems with interoperability. Modern web browsers support a combination of standards-based and de facto HTML and XHTML, which should be rendered in the same way by all browsers.

Extensibility[edit]

A browser extension is a computer program that extends the functionality of a web browser. Every major web browser supports the development of browser extensions.

References[edit]

  1. Jacobs, Ian; Walsh, Norman (15 December 2004). "URI/Resource Relationships". Architecture of the World Wide Web, Volume One. World Wide Web Consortium. http://www.w3.org/TR/webarch/#id-resources. Retrieved 30 June 2009. 
  2. "Browser". Mashable. http://mashable.com/follow/topics/browser/. Retrieved September 2, 2011. 
  3. Stewart, William. "Web Browser History". http://www.livinginternet.com/w/wi_browse.htm. Retrieved 5 May 2009. 
  4. "Tim Berners-Lee: WorldWideWeb, the first Web client". W3.org. http://www.w3.org/People/Berners-Lee/WorldWideWeb.html. Retrieved 2011-12-07. 
  5. a b "Bloomberg Game Changers: Marc Andreesen". Bloomberg.com. 2011-03-17. http://www.bloomberg.com/video/67758394. Retrieved 2011-12-07. 
  6. a b November 24, 2004 (2004-11-24). "Mozilla Firefox Internet Browser Market Share Gains to 7.4%". Search Engine Journal. http://www.searchenginejournal.com/mozilla-firefox-internet-browser-market-share-gains-to-74/1082/. Retrieved 2011-12-07. 
  7. a b c "StatCounter Global Stats". StatCounter. http://gs.statcounter.com/#browser-ww-monthly-201108-201108-bar. 
  8. "Internet Explorer usage to plummet below 50 percent by mid-2012". September 3, 2011. http://www.digitaltrends.com/web/internet-explorer-usage-to-plummet-below-50-percent-by-mid-2012/attachment/net-applications-browser-market/. Retrieved September 4, 2011. 
  9. "CNN Money claims that Chrome is more popular than IE8". CNN. http://money.cnn.com/2011/12/16/technology/chrome_internet_explorer/?source=cnn_bin. Retrieved December 19, 2011. 
  10. "The SeaMonkey Project". Mozilla Foundation. 7 November 2008. http://www.seamonkey-project.org/. Retrieved 30 June 2009. 
  11. "Cyberdog: Welcome to the 'doghouse!". 5 July 2009. http://www.cyberdog.org/. Retrieved 30 June 2009. 
  12. Teelucksingh, Dev Anand. "Interesting DOS programs". Opus Networkx. http://www.opus.co.tt/dave/internet.htm. Retrieved 30 June 2009. 
  13. Andersen, Starr; Abella, Vincent (15 September 2004). "Part 5: Enhanced Browsing Security". Changes to Functionality in Microsoft Windows XP Service Pack 2. Microsoft. http://technet.microsoft.com/en-us/library/bb457150.aspx#EEAA. Retrieved 30 June 2009. 
  14. "Pop-up blocker". Mozilla Foundation. http://support.mozilla.com/en-US/kb/Pop-up+blocker. Retrieved 30 June 2009. 
  15. "Safari: Using The Pop-Up Blocker". Mac Tips and Tricks. WeHostMacs. 2004. http://www.mactipsandtricks.com/tips/display.lasso?mactip=137. Retrieved 30 June 2009. 
  16. "Simple settings". Opera Tutorials. Opera Software. http://www.opera.com/browser/tutorials/settings/#tabs. Retrieved 30 June 2009. 
  17. Bokma, John. "Mozilla Firefox: RSS and Live Bookmarks". http://johnbokma.com/firefox/rss-and-live-bookmarks.html. Retrieved 30 June 2009. 
  18. "RSS newsfeeds in Opera Mail". Opera Software. http://www.opera.com/mail/rss/. Retrieved 30 June 2009. 
  19. "About Browsers and their Features". SpiritWorks Software Development. http://www.about-the-web.com/shtml/browsers.shtml. Retrieved 5 May 2009. 

External links[edit]


Streaming media is multimedia that is constantly received by and presented to an end-user while being delivered by a streaming provider. With streaming, the client browser or plug-in can start displaying the data before the entire file has been transmitted.[note 1] The name refers to the delivery method of the medium rather than to the medium itself. The distinction is usually applied to media that are distributed over telecommunications networks, as most other delivery systems are either inherently streaming (e.g., radio, television) or inherently non-streaming (e.g., books, video cassettes, audio CDs). The verb 'to stream' is also derived from this term, meaning to deliver media in this manner. Internet television is a commonly streamed medium. Streaming media is not limited to video and audio: live closed captioning and stock tickers are considered streaming text, as is Real-Time Text.

Live streaming, the delivery of live content over the Internet, requires a camera for the media, an encoder to digitize the content, a media publisher, and a content delivery network to distribute and deliver the content.

History[edit]

Attempts to display media on computers date back to the earliest days of computing in the mid-20th century. However, little progress was made for several decades, primarily due to the high cost and limited capabilities of computer hardware.

From the late 1980s through the 1990s, consumer-grade personal computers became powerful enough to display various media. The primary technical issues related to streaming were having enough CPU power and bus bandwidth to support the required data rates, and creating low-latency interrupt paths in the operating system to prevent buffer underrun.

However, computer networks were still limited, and media was usually delivered over non-streaming channels, such as by downloading a digital file from a remote server and then saving it to a local drive on the end user's computer or storing it as a digital file and playing it back from CD-ROMs.

During the late 1990s and early 2000s, Internet users saw:

  • greater network bandwidth, especially in the last mile
  • increased access to networks, especially the Internet
  • use of standard protocols and formats, such as TCP/IP, HTTP, and HTML
  • commercialization of the Internet.

Severe Tire Damage was the first band to perform live on the Internet. On June 24, 1993, the band was playing a gig at Xerox PARC while elsewhere in the building, scientists were discussing new technology (the Mbone) for broadcasting on the Internet using multicasting. As proof of their technology, the band was broadcast and could be seen live in Australia and elsewhere.

RealNetworks was also a pioneer in the streaming media market and broadcast one of the earliest audio events over the Internet, a baseball game between the Yankees and the Seattle Mariners, in 1995.[1] It went on to launch the first streaming video technology in 1997 with RealPlayer.

When Word Magazine launched in 1995, it featured the first ever streaming soundtracks on the internet. Using local downtown musicians, the first music stream was 'Big Wheel' by Karthik Swaminathan, and the second was 'When We Were Poor' by Karthik Swaminathan with Marc Ribot and Christine Bard.[citation needed]

Shortly after, in early 1996, Microsoft developed a media player known as ActiveMovie that allowed streaming media and included a proprietary streaming format; this was the predecessor of the streaming feature in Windows Media Player 6.4 in 1999.

In June 1999, Apple introduced a streaming media format in its QuickTime 4 application. It was later widely adopted on websites along with RealPlayer and Windows Media streaming formats. The competing formats required each user to download the respective application for streaming, so many users had to keep all three applications on their computer for general compatibility.

Around 2002, the interest in a single, unified, streaming format and the widespread adoption of Adobe Flash on computers prompted the development of a video streaming format through Flash, which is the format used in Flash-based players on many popular video hosting sites today such as YouTube.

Increasing consumer demand for live streaming has prompted YouTube to implement their new Live Streaming service to users.[2]

These advances in computer networking combined with powerful home computers and modern operating systems made streaming media practical and affordable for ordinary consumers. Stand-alone Internet radio devices emerged to offer listeners a no-computer option for listening to audio streams.

In general, multimedia content has a large volume, so media storage and transmission costs are still significant. To offset this somewhat, media are generally compressed for both storage and streaming.

Increasing consumer demand for streaming of high definition (HD) content to different devices in the home has led the industry to develop a number of technologies, such as Wireless HD or ITU-T G.hn, which are optimized for streaming HD content without forcing the user to install new networking cables.

Today, a media stream can be streamed either live or on demand. Live streams are generally provided by a means called true streaming. True streaming sends the information straight to the computer or device without saving the file to a hard disk. On-demand streaming is provided by a means called progressive streaming or progressive download. Progressive streaming saves the file to a hard disk and then plays it from that location. On-demand streams are often saved to hard disks and servers for extended amounts of time, while live streams are available only at one time (e.g. during a football game).[3]

With the increasing popularity of mobile devices such as tablets and smartphones that depend on battery life, the development of digital media streaming is now focused on formats that do not depend on Adobe Flash, which is known for its relatively high computer resource usage and the resulting drain on a mobile device's battery.

Streaming bandwidth and storage[edit]

A broadband speed of 2.5 Mbit/s or more is recommended for streaming movies, for example to an Apple TV, Google TV or a Sony TV Blu-ray Disc Player; 10 Mbit/s or more is recommended for High Definition content.[4]


Streaming media storage size is calculated from the streaming bandwidth and length of the media using the following formula (for a single user and file):

storage size (in mebibytes) = length (in seconds) × bit rate (in bit/s) / (8 × 1024 × 1024)[note 2]

Real world example:

One hour of video encoded at 300 kbit/s (this is a typical broadband video as of 2005 and it is usually encoded in a 320 × 240 pixels window size) will be:

(3,600 s × 300,000 bit/s) / (8×1024×1024) requires around 128 MiB of storage.

If the file is stored on a server for on-demand streaming and this stream is viewed by 1,000 people at the same time using a Unicast protocol, the requirement is:

300 kbit/s × 1,000 = 300,000 kbit/s = 300 Mbit/s of bandwidth

This is equivalent to around 135 GB per hour. Using a multicast protocol the server sends out only a single stream that is common to all users. Hence, such a stream would only use 300 kbit/s of serving bandwidth. See below for more information on these protocols.

The calculation for Live streaming is similar.

Assumptions: speed at the encoder, is 500 kbit/s.

If the show lasts for 3 hours with 3,000 viewers, then the calculation is:

Number of MiB transferred = encoder speed (in bit/s) × number of seconds × number of viewers / (8*1024*1024)
Number of MiB transferred = 500,000 (bit/s) × 3 × 3,600 ( = 3 hours) × 3,000 (nbr of viewers) / (8*1024*1024) = 1,931,190 MiB
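The storage and bandwidth arithmetic above can be reproduced in a short Python sketch; the figures match the worked examples in this section:

```python
BITS_PER_MIB = 8 * 1024 * 1024  # bits in one mebibyte

def storage_mib(seconds: float, bitrate_bps: float) -> float:
    """Storage for one stored stream: length x bit rate, in MiB."""
    return seconds * bitrate_bps / BITS_PER_MIB

def transferred_mib(bitrate_bps: float, seconds: float, viewers: int) -> float:
    """Unicast live stream: every viewer receives a separate copy."""
    return bitrate_bps * seconds * viewers / BITS_PER_MIB

# One hour of 300 kbit/s video (the on-demand example above):
print(f"{storage_mib(3600, 300_000):.1f} MiB")               # 128.7 MiB

# Three hours at 500 kbit/s with 3,000 viewers (the live example):
print(f"{transferred_mib(500_000, 3 * 3600, 3000):.0f} MiB")  # 1931190 MiB
```

Multicast changes only the second calculation: the server emits one 500 kbit/s stream regardless of the number of viewers, so the viewer factor drops out.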

Codec, bitstream, transport, control[edit]

The audio stream is compressed using an audio codec such as MP3, Vorbis or AAC.

The video stream is compressed using a video codec such as H.264 or VP8.

Encoded audio and video streams are assembled in a container bitstream such as FLV, WebM, ASF or ISMA.

The bitstream is delivered from a streaming server to a streaming client using a transport protocol, such as MMS or RTP.

The streaming client may interact with the streaming server using a control protocol, such as MMS or RTSP.

Protocol issues[edit]

Designing a network protocol to support streaming media raises many issues, such as:

  • Datagram protocols, such as the User Datagram Protocol (UDP), send the media stream as a series of small packets. This is simple and efficient; however, there is no mechanism within the protocol to guarantee delivery. It is up to the receiving application to detect loss or corruption and recover data using error correction techniques. If data is lost, the stream may suffer a dropout.
  • The Real-time Streaming Protocol (RTSP), Real-time Transport Protocol (RTP) and the Real-time Transport Control Protocol (RTCP) were specifically designed to stream media over networks. RTSP runs over a variety of transport protocols, while the latter two are built on top of UDP.
  • Another approach that seems to incorporate both the advantages of using a standard web protocol and the ability to be used for streaming even live content is the HTTP adaptive bitrate streaming. HTTP adaptive bitrate streaming is based on HTTP progressive download, but contrary to the previous approach, here the files are very small, so that they can be compared to the streaming of packets, much like the case of using RTSP and RTP.[5]
  • Reliable protocols, such as the Transmission Control Protocol (TCP), guarantee correct delivery of each bit in the media stream. However, they accomplish this with a system of timeouts and retries, which makes them more complex to implement. It also means that when there is data loss on the network, the media stream stalls while the protocol handlers detect the loss and retransmit the missing data. Clients can minimize this effect by buffering data for display. While delay due to buffering is acceptable in video on demand scenarios, users of interactive applications such as video conferencing will experience a loss of fidelity if the delay that buffering contributes to exceeds 200 ms.[6]
  • Unicast protocols send a separate copy of the media stream from the server to each recipient. Unicast is the norm for most Internet connections, but does not scale well when many users want to view the same television program concurrently.
Multicasting broadcasts the same copy of the multimedia over the entire network to a group of clients
  • Multicast protocols were developed to reduce the data replication (and consequent server/network loads) that occurs when many recipients receive unicast content streams independently. These protocols send a single stream from the source to a group of recipients. Depending on the network infrastructure and type, multicast transmission may or may not be feasible. One potential disadvantage of multicasting is the loss of video on demand functionality. Continuous streaming of radio or television material usually precludes the recipient's ability to control playback. However, this problem can be mitigated by elements such as caching servers, digital set-top boxes, and buffered media players.
  • IP Multicast provides a means to send a single media stream to a group of recipients on a computer network. A multicast protocol, usually Internet Group Management Protocol, is used to manage delivery of multicast streams to the groups of recipients on a LAN. One of the challenges in deploying IP multicast is that routers and firewalls between LANs must allow the passage of packets destined to multicast groups. If the organization that is serving the content has control over the network between server and recipients (i.e., educational, government, and corporate intranets), then routing protocols such as Protocol Independent Multicast can be used to deliver stream content to multiple Local Area Network segments.
  • Peer-to-peer (P2P) protocols arrange for prerecorded streams to be sent between computers. This prevents the server and its network connections from becoming a bottleneck. However, it raises technical, performance, quality, and business issues.
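
The datagram trade-off described in the first bullet — simple, efficient delivery with no guarantee, leaving loss detection to the application — can be sketched in a few lines of Python. This is a toy illustration rather than a real streaming protocol: the 4-byte sequence-number header, chunk format, and loopback transport are invented for the example.

```python
import socket
import struct

# Toy sketch: media "chunks" sent as UDP datagrams with a sequence number;
# the receiver detects gaps (dropouts). The header format is invented here.

def send_chunks(sock, addr, chunks, drop=None):
    for seq, chunk in enumerate(chunks):
        if seq == drop:                     # simulate a packet lost in transit
            continue
        sock.sendto(struct.pack("!I", seq) + chunk, addr)

def receive_chunks(sock, expected):
    sock.settimeout(0.5)                    # don't stall forever on loss
    received = {}
    for _ in range(expected):
        try:
            data, _ = sock.recvfrom(2048)
        except socket.timeout:
            break
        seq = struct.unpack("!I", data[:4])[0]
        received[seq] = data[4:]
    missing = [s for s in range(expected) if s not in received]
    return received, missing

receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))             # OS picks a free port
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
chunks = [b"frame%d" % i for i in range(5)]
send_chunks(sender, receiver.getsockname(), chunks, drop=2)
received, missing = receive_chunks(receiver, len(chunks))
print(missing)                              # the stream suffers a dropout at 2
```

A real client would conceal such a gap with error correction or interpolation; retransmission is exactly the behaviour that the reliable-protocol bullet above trades latency for.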


Notes[edit]

Footnotes[edit]

  1. The term "presented" is used in this article in a general sense that includes audio or video playback.
  2. 1 mebibyte = 8 × 1024 × 1024 bits.

Citations[edit]

  1. "RealNetworks Inc.". Funding Universe. http://www.fundinguniverse.com/company-histories/RealNetworks-Inc-Company-History.html. Retrieved 2011-07-23. 
  2. Josh Lowensohn (2008). "YouTube to Offer Live Streaming This Year". http://news.cnet.com/8301-17939_109-9883062-2.html. Retrieved 2011-07-23. 
  3. Grant and Meadows. (2009). Communication Technology Update and Fundamentals 11th Edition. pp.114
  4. Minimum requirements for Sony TV Blu-ray Disc Player, on advertisement attached to a Netflix DVD.
  5. Ch. Z. Patrikakis, N. Papaoulakis, Ch. Stefanoudaki, M. S. Nunes, “Streaming content wars: Download and play strikes back” presented at the Personalization in Media Delivery Platforms Workshop, [218 – 226], Venice, Italy, 2009.
  6. Krasic, C. and Li, K. and Walpole, J., The case for streaming multimedia with TCP, Lecture Notes in Computer Science, pages 213--218, Springer, 2001
Diagram of two computers connected only via a proxy server. The first computer says to the proxy server: "ask the second computer what the time is".
Communication between two computers (shown in grey) connected through a third computer (shown in red) acting as a proxy.

In computer networks, a proxy server is a server (a computer system or an application) that acts as an intermediary for requests from clients seeking resources from other servers. A client connects to the proxy server, requesting some service, such as a file, connection, web page, or other resource available from a different server. The proxy server evaluates the request according to its filtering rules. For example, it may filter traffic by IP address or protocol. If the request is validated by the filter, the proxy provides the resource by connecting to the relevant server and requesting the service on behalf of the client. A proxy server may optionally alter the client's request or the server's response, and sometimes it may serve the request without contacting the specified server. In this case, it 'caches' responses from the remote server, and returns subsequent requests for the same content directly.
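
The evaluate-filter-forward-or-cache behaviour described above can be sketched as follows. The `fetch` callable, the host-based rule, and the hostnames are hypothetical stand-ins for illustration, not any real proxy's API.

```python
# Minimal sketch of proxy request handling: filter the request, then serve
# from cache or forward to the origin server on behalf of the client.

class Proxy:
    def __init__(self, fetch, blocked_hosts=()):
        self.fetch = fetch                  # callable that contacts the origin
        self.blocked = set(blocked_hosts)   # simple host-based filtering rule
        self.cache = {}

    def request(self, host, path):
        if host in self.blocked:
            return 403, b"blocked by proxy policy"
        key = (host, path)
        if key in self.cache:               # repeat request: serve cached copy
            return 200, self.cache[key]
        status, body = self.fetch(host, path)
        if status == 200:
            self.cache[key] = body
        return status, body

origin_calls = []
def fake_fetch(host, path):                 # stand-in for a real HTTP fetch
    origin_calls.append((host, path))
    return 200, b"payload for " + path.encode()

proxy = Proxy(fake_fetch, blocked_hosts={"ads.example"})
print(proxy.request("ads.example", "/banner")[0])   # 403: rejected by filter
proxy.request("example.org", "/page")
proxy.request("example.org", "/page")               # second hit is cached
print(len(origin_calls))                            # 1: origin contacted once
```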

The proxy concept was invented in the early days of distributed systems[1] as a way to simplify and control their complexity. Today, most proxies are web proxies, allowing access to content on the World Wide Web.

Uses[edit]

A proxy server has a large variety of potential purposes, including:

  • To keep machines behind it anonymous, mainly for security.[2]
  • To speed up access to resources (using caching). Web proxies are commonly used to cache web pages from a web server.[3]
  • To apply access policy to network services or content, e.g. to block undesired sites.
  • To access sites prohibited or filtered by the user's ISP or institution.
  • To log / audit usage, i.e. to provide company employee Internet usage reporting.
  • To bypass security / parental controls.
  • To circumvent Internet filtering to access content otherwise blocked by governments.[4]
  • To scan transmitted content for malware before delivery.
  • To scan outbound content, e.g., for data loss prevention.
  • To allow a web site to make web requests to externally hosted resources (e.g. images, music files, etc.) when cross-domain restrictions prohibit the web site from linking directly to the outside domains.

A proxy server that passes requests and responses unmodified is usually called a gateway or sometimes tunneling proxy.

A proxy server can be placed in the user's local computer or at various points between the user and the destination servers on the Internet.

A reverse proxy is (usually) an Internet-facing proxy used as a front-end to control and protect access to a server on a private network, commonly also performing tasks such as load-balancing, authentication, decryption or caching.

Types of proxy[edit]

Forward proxies[edit]

A proxy server connecting an internal network and the Internet.
A forward proxy taking requests from an internal network and forwarding them to the Internet.

Forward proxies are proxies in which the client names the target server to connect to.[5] Forward proxies are able to retrieve from a wide range of sources (in most cases anywhere on the Internet).

The terms "forward proxy" and "forwarding proxy" are a general description of behavior (forwarding traffic) and thus ambiguous. Except for reverse proxies, the types of proxies described here are more specialized sub-types of the general forward proxy concept.

Open proxies[edit]

Diagram of proxy server connected to the Internet.
An open proxy forwarding requests from and to anywhere on the Internet.

An open proxy is a forwarding proxy server that is accessible by any Internet user. Gordon Lyon estimates there are "hundreds of thousands" of open proxies on the Internet.[6] An anonymous open proxy allows users to conceal their IP address while browsing the Web or using other Internet services. There are varying degrees of anonymity however, as well as a number of methods of 'tricking' the client into revealing itself regardless of the proxy being used.

Reverse proxies[edit]

A proxy server connecting the Internet to an internal network.
A reverse proxy taking requests from the Internet and forwarding them to servers in an internal network. Those making requests connect to the proxy and may not be aware of the internal network.

A reverse proxy (or surrogate) is a proxy server that appears to clients to be an ordinary server. Requests are forwarded to one or more origin servers which handle the request. The response is returned as if it came directly from the proxy server.[5]

Reverse proxies are installed in the neighborhood of one or more web servers. All traffic coming from the Internet and with a destination of one of the neighborhood's web servers goes through the proxy server. The use of "reverse" originates in its counterpart "forward proxy" since the reverse proxy sits closer to the web server and serves only a restricted set of websites.

There are several reasons for installing reverse proxy servers:

  • Encryption / SSL acceleration: when secure web sites are created, the SSL encryption is often not done by the web server itself, but by a reverse proxy that is equipped with SSL acceleration hardware. See Secure Sockets Layer. Furthermore, a host can provide a single "SSL proxy" to provide SSL encryption for an arbitrary number of hosts; removing the need for a separate SSL Server Certificate for each host, with the downside that all hosts behind the SSL proxy have to share a common DNS name or IP address for SSL connections. This problem can partly be overcome by using the SubjectAltName feature of X.509 certificates.
  • Load balancing: the reverse proxy can distribute the load to several web servers, each web server serving its own application area. In such a case, the reverse proxy may need to rewrite the URLs in each web page (translation from externally known URLs to the internal locations).
  • Serve/cache static content: A reverse proxy can offload the web servers by caching static content like pictures and other static graphical content.
  • Compression: the proxy server can optimize and compress the content to speed up the load time.
  • Spoon feeding: reduces resource usage caused by slow clients on the web servers by caching the content the web server sent and slowly "spoon feeding" it to the client. This especially benefits dynamically generated pages.
  • Security: the proxy server is an additional layer of defense and can protect against some OS and WebServer specific attacks. However, it does not provide any protection to attacks against the web application or service itself, which is generally considered the larger threat.
  • Extranet Publishing: a reverse proxy server facing the Internet can be used to communicate to a firewalled server internal to an organization, providing extranet access to some functions while keeping the servers behind the firewalls. If used in this way, security measures should be considered to protect the rest of your infrastructure in case this server is compromised, as its web application is exposed to attack from the Internet.
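
Of the reasons above, load balancing is the easiest to make concrete. The following is a minimal round-robin sketch; the backend names and routing policy are invented for illustration, and real reverse proxies add health checks, weighting, and session affinity.

```python
import itertools

# A reverse proxy distributing incoming requests across backend servers
# in round-robin order (backend addresses are made up for this sketch).

class ReverseProxy:
    def __init__(self, backends):
        self._backends = itertools.cycle(backends)

    def route(self, path):
        backend = next(self._backends)      # pick the next backend in turn
        return backend, path                # a real proxy would forward here

proxy = ReverseProxy(["app-1:8080", "app-2:8080", "app-3:8080"])
routes = [proxy.route("/index.html")[0] for _ in range(6)]
print(routes)   # each backend receives every third request
```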

Performance Enhancing Proxies[edit]

A performance enhancing proxy (PEP) is a proxy designed to mitigate specific link-related issues or degradations. PEPs are typically used to improve TCP performance in the presence of high round-trip times (RTTs) and on wireless links with high packet loss. They are also frequently used for highly asymmetric links featuring very different upload and download rates.

Uses of proxy servers[edit]

Filtering[edit]

A content-filtering web proxy server provides administrative control over the content that may be relayed through the proxy. It is commonly used in both commercial and non-commercial organizations (especially schools) to ensure that Internet usage conforms to an acceptable use policy. In some cases users can circumvent the proxy, since there are services designed to proxy information from a filtered website through a non-filtered site to allow it through the user's proxy.[7]

A content filtering proxy will often support user authentication, to control web access. It also usually produces logs, either to give detailed information about the URLs accessed by specific users, or to monitor bandwidth usage statistics. It may also communicate to daemon-based and/or ICAP-based antivirus software to provide security against virus and other malware by scanning incoming content in real time before it enters the network.

Many work places, schools, and colleges restrict the web sites and online services that are made available in their buildings. This is done either with a specialized proxy, called a content filter (both commercial and free products are available), or by using a cache-extension protocol such as ICAP, that allows plug-in extensions to an open caching architecture.

Some common methods used for content filtering include: URL or DNS blacklists, URL regex filtering, MIME filtering, or content keyword filtering. Some products have been known to employ content analysis techniques to look for traits commonly used by certain types of content providers.
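
A minimal sketch of these methods — URL/DNS blacklist, URL regex filtering, and content keyword filtering — might look like the following; the rules themselves are illustrative, not drawn from any real filtering product.

```python
import re

# Illustrative filtering rules (hypothetical hosts, patterns, and keywords).
BLACKLISTED_HOSTS = {"casino.example", "tracker.example"}     # URL/DNS blacklist
URL_PATTERNS = [re.compile(r"/ads/"), re.compile(r"\.exe$")]  # URL regex rules
BANNED_KEYWORDS = [b"gambling"]                               # content keywords

def allowed(host, path, body=b""):
    if host in BLACKLISTED_HOSTS:
        return False
    if any(p.search(path) for p in URL_PATTERNS):
        return False
    if any(k in body.lower() for k in BANNED_KEYWORDS):
        return False
    return True

print(allowed("news.example", "/story.html"))      # True
print(allowed("casino.example", "/index.html"))    # False: blacklisted host
print(allowed("cdn.example", "/ads/banner.png"))   # False: URL regex match
```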

Requests made to the open internet must first pass through an outbound proxy filter. The web-filtering company provides a database of URL patterns (regular expressions) with associated content attributes. This database is updated weekly by site-wide subscription, much like a virus filter subscription. The administrator instructs the web filter to ban broad classes of content (such as sports, pornography, online shopping, gambling, or social networking). Requests that match a banned URL pattern are rejected immediately.

Assuming the requested URL is acceptable, the content is then fetched by the proxy. At this point a dynamic filter may be applied on the return path. For example, JPEG files could be blocked based on fleshtone matches, or language filters could dynamically detect unwanted language. If the content is rejected then an HTTP fetch error is returned and nothing is cached.


Most web filtering companies use an internet-wide crawling robot that assesses the likelihood that content is of a certain type. The resultant database is then corrected by manual labor based on complaints or known flaws in the content-matching algorithms.

Web filtering proxies are not able to peer inside secure (HTTPS) transactions, assuming the chain-of-trust of SSL/TLS has not been tampered with. As a result, users wanting to bypass web filtering will typically search the internet for an open and anonymous HTTPS transparent proxy. They will then program their browser to proxy all requests through the web filter to this anonymous proxy. Those requests will be encrypted with HTTPS. The web filter cannot distinguish these transactions from, say, a legitimate access to a financial website. Thus, content filters are only effective against unsophisticated users.

As mentioned above, the SSL/TLS chain-of-trust relies on trusted root certificate authorities; in a workplace setting where the client is managed by the organization, trust might be granted to a root certificate whose private key is known to the proxy. Concretely, a root certificate generated by the proxy is installed into the browser CA list by IT staff. In such scenarios, proxy analysis of the contents of an SSL/TLS transaction becomes possible. The proxy is effectively operating a man-in-the-middle attack, allowed by the client's trust of a root certificate the proxy owns.

A special case of web proxies is "CGI proxies". These are web sites that allow a user to access a site through them. They generally use PHP or CGI to implement the proxy functionality. These types of proxies are frequently used to gain access to web sites blocked by corporate or school proxies. Since they also hide the user's own IP address from the web sites they access through the proxy, they are sometimes also used to gain a degree of anonymity, called "Proxy Avoidance".

Caching[edit]

A caching proxy server accelerates service requests by retrieving content saved from a previous request made by the same client or even other clients. Caching proxies keep local copies of frequently requested resources, allowing large organizations to significantly reduce their upstream bandwidth usage and costs, while significantly increasing performance. Most ISPs and large businesses have a caching proxy. Caching proxies were the first kind of proxy server.

Some poorly-implemented caching proxies have had downsides (e.g., an inability to use user authentication). Some problems are described in RFC 3143 (Known HTTP Proxy/Caching Problems).

Another important use of the proxy server is to reduce the hardware cost. An organization may have many systems on the same network or under control of a single server, prohibiting the possibility of an individual connection to the Internet for each system. In such a case, the individual systems can be connected to one proxy server, and the proxy server connected to the main server. An example of a software caching proxy is Squid.

DNS proxy[edit]

A DNS proxy server takes DNS queries from a (usually local) network and forwards them to an Internet Domain Name Server. It may also cache DNS records.
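
The forward-and-cache behaviour can be sketched as follows. The `upstream` callable stands in for a query to a real Domain Name Server, and the single fixed TTL is a simplification; real resolvers honour per-record TTLs.

```python
import time

# Sketch of a DNS proxy: forward queries upstream, cache the answers.

class DNSProxy:
    def __init__(self, upstream, ttl=60):
        self.upstream = upstream      # callable that queries the real DNS server
        self.ttl = ttl
        self.cache = {}               # name -> (address, expiry time)

    def resolve(self, name):
        cached = self.cache.get(name)
        if cached and cached[1] > time.time():
            return cached[0]          # answer served from the local cache
        address = self.upstream(name)  # forward the query upstream
        self.cache[name] = (address, time.time() + self.ttl)
        return address

queries = []
def fake_upstream(name):              # stand-in for a real upstream resolver
    queries.append(name)
    return "192.0.2.10"

proxy = DNSProxy(fake_upstream)
proxy.resolve("www.example.org")
proxy.resolve("www.example.org")      # second lookup hits the cache
print(len(queries))                   # upstream asked only once
```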

Bypassing filters and censorship[edit]

If the destination server filters content based on the origin of the request, the use of a proxy can circumvent this filter. For example, a server using IP-based geolocation to restrict its service to a certain country can be accessed using a proxy located in that country to access the service.

Likewise, a badly configured proxy can provide access to a network otherwise isolated from the Internet.[6]

Logging and eavesdropping[edit]

Proxies can be installed in order to eavesdrop upon the data-flow between client machines and the web. All content sent or accessed – including passwords submitted and cookies used – can be captured and analyzed by the proxy operator. For this reason, passwords to online services (such as webmail and banking) should always be exchanged over a cryptographically secured connection, such as SSL.

By chaining proxies which do not reveal data about the original requester, it is possible to obfuscate activities from the eyes of the user's destination. However, more traces will be left on the intermediate hops, which could be used or offered up to trace the user's activities. If the policies and administrators of these other proxies are unknown, the user may fall victim to a false sense of security just because those details are out of sight and mind.

In what is more of an inconvenience than a risk, proxy users may find themselves being blocked from certain Web sites, as numerous forums and Web sites block IP addresses from proxies known to have spammed or trolled the site. Proxy bouncing can be used to maintain your privacy.


Gateways to private networks[edit]

Proxy servers can perform a role similar to a network switch in linking two networks.

Accessing services anonymously[edit]

An anonymous proxy server (sometimes called a web proxy) generally attempts to anonymize web surfing. There are different varieties of anonymizers. The destination server (the server that ultimately satisfies the web request) receives requests from the anonymizing proxy server, and thus does not receive information about the end user's address. However, the requests are not anonymous to the anonymizing proxy server, and so a degree of trust is present between the proxy server and the user. Many of them are funded through a continued advertising link to the user.

Access control: Some proxy servers implement a logon requirement. In large organizations, authorized users must log on to gain access to the web. The organization can thereby track usage to individuals.

Some anonymizing proxy servers may forward data packets with header lines such as HTTP_VIA, HTTP_X_FORWARDED_FOR, or HTTP_FORWARDED, which may reveal the IP address of the client. Other anonymizing proxy servers, known as elite or high anonymity proxies, only include the REMOTE_ADDR header with the IP address of the proxy server, making it appear that the proxy server is the client. A website could still suspect a proxy is being used if the client sends packets which include a cookie from a previous visit that did not use the high anonymity proxy server. Clearing cookies, and possibly the cache, would solve this problem.
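
A destination site's view of these headers can be sketched as a simple classification. The CGI-style HTTP_* names follow the text above; the logic is illustrative, since real detection also weighs cookies, TLS characteristics, and known proxy IP ranges.

```python
# Header lines that reveal a proxy is in the path (CGI-style names, as above).
REVEALING_HEADERS = ("HTTP_VIA", "HTTP_X_FORWARDED_FOR", "HTTP_FORWARDED")

def classify(request_headers):
    if any(h in request_headers for h in REVEALING_HEADERS):
        return "proxy detected"
    return "direct client or high-anonymity proxy"

print(classify({"REMOTE_ADDR": "203.0.113.7",
                "HTTP_X_FORWARDED_FOR": "198.51.100.2"}))   # proxy detected
print(classify({"REMOTE_ADDR": "203.0.113.7"}))
```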

Implementations of proxies[edit]

Web proxy[edit]

A web proxy passes along HTTP requests like any other proxy server. However, the web proxy accepts target URLs within a user's browser window, processes the request, and then displays the contents of the requested URL immediately back within the user's browser. This is generally quite different from a corporate intranet proxy, which some people mistakenly refer to as a web proxy.

Suffix proxy[edit]

A suffix proxy allows a user to access web content by appending the name of the proxy server to the URL of the requested content (e.g. "en.wikipedia.org.SuffixProxy.com"). Suffix proxy servers are easier to use than regular proxy servers but do not offer anonymity; their primary use is bypassing web filters. However, they are rarely used now due to more advanced web filters.

Transparent proxy[edit]

Also known as an intercepting proxy or forced proxy, a transparent proxy intercepts normal communication without requiring any special client configuration. Clients need not be aware of the existence of the proxy. A transparent proxy is normally located between the client and the Internet, with the proxy performing some of the functions of a gateway or router.[8]

RFC 2616 (Hypertext Transfer Protocol—HTTP/1.1) offers standard definitions:

"A 'transparent proxy' is a proxy that does not modify the request or response beyond what is required for proxy authentication and identification".
"A 'non-transparent proxy' is a proxy that modifies the request or response in order to provide some added service to the user agent, such as group annotation services, media type transformation, protocol reduction, or anonymity filtering".

In 2009 a security flaw in the way that transparent proxies operate was published by Robert Auger,[9] and the Computer Emergency Response Team issued an advisory listing dozens of affected transparent and intercepting proxy servers.[10]

Purpose[edit]

Intercepting proxies are commonly used in businesses to prevent avoidance of acceptable use policy, and to ease administrative burden, since no client browser configuration is required. This second reason, however, is mitigated by features such as Active Directory group policy, or DHCP and automatic proxy detection.

Intercepting proxies are also commonly used by ISPs in some countries to save upstream bandwidth and improve customer response times by caching. This is more common in countries where bandwidth is more limited (e.g. island nations) or must be paid for.

Issues[edit]

The diversion / interception of a TCP connection creates several issues. Firstly, the original destination IP and port must somehow be communicated to the proxy. This is not always possible (e.g. where the gateway and proxy reside on different hosts). There is a class of cross-site attacks which depend on certain behaviour of intercepting proxies that do not check or have access to information about the original (intercepted) destination. This problem can be resolved by using an integrated packet-level and application-level appliance or software which is then able to communicate this information between the packet handler and the proxy.

Intercepting also creates problems for HTTP authentication, especially connection-oriented authentication such as NTLM, since the client browser believes it is talking to a server rather than a proxy. This can cause problems where an intercepting proxy requires authentication, then the user connects to a site which also requires authentication.

Finally, intercepting connections can cause problems for HTTP caches, since some requests and responses become uncacheable by a shared cache.

Therefore, intercepting connections is generally discouraged. However, due to the simplicity of deploying such systems, they are in widespread use.

Implementation methods[edit]

Interception can be performed using Cisco's WCCP (Web Cache Communication Protocol). This proprietary protocol resides on the router and is configured from the cache, allowing the cache to determine which ports and traffic are sent to it via transparent redirection from the router. This redirection can occur in one of two ways: GRE tunneling (OSI Layer 3) or MAC rewrites (OSI Layer 2).

Once traffic reaches the proxy machine itself interception is commonly performed with NAT (Network Address Translation). Such setups are invisible to the client browser, but leave the proxy visible to the web server and other devices on the internet side of the proxy. Recent Linux and some BSD releases provide TPROXY (transparent proxy) which performs IP-level (OSI Layer 3) transparent interception and spoofing of outbound traffic, hiding the proxy IP address from other network devices.

Detection[edit]

There are several methods that can often be used to detect the presence of an intercepting proxy server:

  • By comparing the client's external IP address to the address seen by an external web server, or sometimes by examining the HTTP headers received by a server. A number of sites have been created to address this issue, by reporting the user's IP address as seen by the site back to the user in a web page.[7]
  • By comparing the sequence of network hops reported by a tool such as traceroute for a proxied protocol such as HTTP (port 80) with that for a non-proxied protocol such as SMTP (port 25).
  • By attempting to make a connection to an IP address at which there is known to be no server. The proxy will accept the connection and then attempt to proxy it on. When the proxy finds no server to accept the connection it may return an error message or simply close the connection to the client. This difference in behaviour is simple to detect. For example, most web browsers will generate a browser-created error page when they cannot connect to an HTTP server, but will return a different error when the connection is accepted and then closed.[11]
  • By serving the end-user specially programmed flash files that send HTTP calls back to their server.

Tor onion proxy software[edit]

Screenshot of computer program showing computer locations on a world map.
The Vidalia Tor-network map.

Tor (short for The Onion Router) is a system intended to enable online anonymity.[12] Tor client software routes Internet traffic through a worldwide volunteer network of servers in order to conceal a user's location or usage from someone conducting network surveillance or traffic analysis. Using Tor makes it more difficult to trace Internet activity, including "visits to Web sites, online posts, instant messages and other communication forms", back to the user.[12] It is intended to protect users' personal freedom, privacy, and ability to conduct confidential business by keeping their internet activities from being monitored.

"Onion routing" refers to the layered nature of the encryption service: The original data are encrypted and re-encrypted multiple times, then sent through successive Tor relays, each one of which decrypts a "layer" of encryption before passing the data on to the next relay and ultimately the destination. This reduces the possibility of the original data being unscrambled or understood in transit.[13]
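
The layering can be illustrated with a deliberately toy scheme in which each relay's "encryption" is just base64 tagged with the relay's name, so the layered structure stays visible. Real Tor uses per-hop negotiated symmetric ciphers; nothing here is cryptographically secure.

```python
from base64 import b64encode, b64decode

# Toy onion layering (NOT real cryptography): the sender wraps the payload
# once per relay, innermost layer last-hop first; each relay then peels
# exactly one layer before passing the data on.

def wrap(data, relays):
    for relay in reversed(relays):          # innermost layer is for the exit
        data = relay.encode() + b"|" + b64encode(data)
    return data

def unwrap_one(data, relay):
    name, payload = data.split(b"|", 1)
    assert name == relay.encode()           # each relay peels only its own layer
    return b64decode(payload)

relays = ["entry", "middle", "exit"]
onion = wrap(b"hello destination", relays)
for relay in relays:                        # hops peel layers in path order
    onion = unwrap_one(onion, relay)
print(onion)                                # b'hello destination'
```

No single relay sees both the plaintext and the sender: the entry relay sees only another encrypted layer, which is the property the paragraph above describes.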

The Tor client is free software, and there are no additional charges to use the network.

I2P anonymous proxy[edit]

The I2P anonymous network ('I2P') is a proxy network aiming at online anonymity. It implements garlic routing, which is an enhancement of Tor's onion routing. I2P is fully distributed and works by encrypting all communications in various layers and relaying them through a network of routers run by volunteers in various locations. By keeping the source of the information hidden, I2P offers censorship resistance. The goals of I2P are to protect users' personal freedom, privacy, and ability to conduct confidential business.

Each user of I2P runs an I2P router on their computer (node). The I2P router takes care of finding other peers and building anonymizing tunnels through them. I2P provides proxies for all protocols (HTTP, IRC, SOCKS, ...).

The software is free and open-source, and the network is free of charge to use.

References[edit]

  1. Shapiro, Marc (May 1986). "Structure and encapsulation in distributed systems: the Proxy Principle". Int. Conf. on Distributed Computer Sys.: 198–204. http://hal.inria.fr/docs/00/44/46/51/PDF/SEDSPP_icdcs86.pdf. Retrieved 4 September 2011. 
  2. "Firewall and Proxy Server HOWTO". tldp.org. http://tldp.org/HOWTO/Firewall-HOWTO-11.html. Retrieved 4 September 2011. "The proxy server is, above all, a security device." 
  3. Thomas, Keir (2006). Beginning Ubuntu Linux: From Novice to Professional. Apress. ISBN 9781590596272. "A proxy server helps speed up Internet access by storing frequently accessed pages" 
  4. "2010 Circumvention Tool Usage Report". The Berkman Center for Internet & Society at Harvard University. October 2010. http://cyber.law.harvard.edu/sites/cyber.law.harvard.edu/files/2010_Circumvention_Tool_Usage_Report.pdf. 
  5. a b "Forward and Reverse Proxies". httpd mod_proxy. Apache. http://httpd.apache.org/docs/2.0/mod/mod_proxy.html#forwardreverse. Retrieved 20 December 2010. 
  6. a b Lyon, Gordon (2008). Nmap network scanning. US: Insecure. p. 270. ISBN 9780979958717. 
  7. "Using a Ninjaproxy to get through a filtered proxy.". advanced filtering mechanics. TSNP. http://sitevana.com/webtech/. Retrieved 17 September 2011. 
  8. "What is an intercepting proxy?". uCertify. 28 February 2010. http://www.ucertify.com/article/what-is-an-intercepting-proxy.html. Retrieved 4 September 2011. 
  9. "Socket Capable Browser Plugins Result In Transparent Proxy Abuse". The Security Practice. 9 March 2009. http://www.thesecuritypractice.com/the_security_practice/2009/03/socket-capable-browser-plugins-result-in-transparent-proxy-abuse.html. Retrieved 14 August 2010. 
  10. "Vulnerability Note VU#435052". US CERT. 23 February 2009. http://www.kb.cert.org/vuls/id/435052. Retrieved 14 August 2010. 
  11. Wessels, Duane (2004). Squid The Definitive Guide. O'Reilly. pp. 130. ISBN 9780596001629. 
  12. a b Glater, Jonathan (25 January 2006). "Privacy for People Who Don't Show Their Navels". The New York Times. http://www.nytimes.com/2006/01/25/technology/techspecial2/25privacy.html?_r=1. Retrieved 4 August 2011. 
  13. The Tor Project. "Tor: anonymity online". https://www.torproject.org/. Retrieved 9 January 2011. 

External links[edit]


The results of a search for the term "lunar eclipse" in a web-based image search engine

A search engine is a software system that is designed to search for information placed on web pages on the Internet. The response from the service is generally presented in a vertical list on what is most often referred to as a results page. The information may be a mix of web pages, images, videos, maps and other types of files. Some search engines also mine data from public databases or open directories. Unlike web directories, which are maintained only by human editors, search engines also maintain real-time information by running a web crawler which applies their search algorithm to all new and changed web pages it finds. Internet content that is not capable of being searched by a web search engine is generally described as the "deep web."
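
The crawl-index-query cycle described above can be reduced to a small sketch: build an inverted index from fetched pages, then intersect the posting sets for each query term. The pages here are hard-coded stand-ins for crawled documents, and real engines add ranking, stemming, and incremental updates.

```python
from collections import defaultdict

# Hypothetical "crawled" documents standing in for fetched web pages.
pages = {
    "http://a.example": "total lunar eclipse tonight",
    "http://b.example": "solar eclipse glasses",
    "http://c.example": "lunar rover photos",
}

# Inverted index: each word maps to the set of pages containing it.
index = defaultdict(set)
for url, text in pages.items():
    for word in text.split():
        index[word].add(url)

def search(query):
    words = query.split()
    if not words:
        return []
    hits = set.intersection(*(index[w] for w in words))
    return sorted(hits)                 # a real engine would rank, not sort

print(search("lunar eclipse"))          # ['http://a.example']
```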

History[edit]

Timeline
Year Engine Current status
1993 W3Catalog Inactive
Aliweb Inactive
JumpStation Inactive
World-Wide Web Worm Inactive
1994 WebCrawler Active (Aggregator)
Go.com Inactive (redirects to Disney)
Lycos Active
Infoseek Inactive (redirects to Disney)
1995 Daum Active
Magellan Inactive
Excite Active
SAPO Active
Yahoo! (directory) Active (as Yahoo! Search since 2004)
AltaVista Inactive (Yahoo! acquisition: 2003, redirected: 2013)
1996 Dogpile Active (Aggregator)
Inktomi Inactive (acquired by Yahoo!)
HotBot Active (lycos.com)
Ask Jeeves Active (rebranded as Ask.com)
1997 Northern Light Inactive
Yandex Active
1998 Google Active
Ixquick Active (alias of Startpage)
MSN Search Active (as Bing)
empas Inactive (merged with NATE)
1999 AlltheWeb Inactive (redirects to Yahoo!)
GenieKnows Active (rebranded Yellowee.com)
Naver Active
Teoma Inactive (redirects to Ask.com)
Vivisimo Inactive
2000 Baidu Active
Exalead Active
Gigablast Active
2001 Kartoo Inactive
2003 Info.com Active
Scroogle Inactive
2004 Yahoo! Search Active (originally Yahoo! (directory), 1995)
A9.com Inactive
Sogou Active
2005 AOL Search Active
SearchMe Inactive
2006 Soso Inactive (redirects to Sogou)
Quaero Inactive
Search.com Active
ChaCha Inactive
Ask.com Active (originally Ask Jeeves, 1996)
Live Search Active (as Bing, originally MSN Search, 1998)
2007 wikiseek Inactive
Sproose Inactive
Wikia Search Inactive
Blackle.com Active (alias of Google)
2008 Powerset Inactive (redirects to Bing)
Picollator Inactive
Viewzi Inactive
Boogami Inactive
LeapFish Inactive
Forestle Inactive (redirects to Ecosia)
DuckDuckGo Active
2009 Bing Active (originally MSN Search, 1998)
Yebol Inactive
Mugurdy Inactive
Scout (by Goby) Active
NATE Active
Ecosia Active
2010 Blekko Inactive (sold to IBM)
Cuil Inactive
Yandex (English) Active
2011 YaCy Active (Peer-to-peer search engine)
2012 Volunia Inactive
2013 Qwant Active
Infoseek Inactive (redirects to Disney)
2014 Egerin Active (Kurdish/Sorani search engine)
2015 Cliqz Active (browser-integrated search engine)
2016 Search Encrypt Active

Internet search engines themselves predate the debut of the Web in December 1990. The Whois user search dates back to 1982,[1] and the Knowbot Information Service multi-network user search was first implemented in 1989.[2] The first well-documented search engine that searched content files, namely FTP files, was Archie, which debuted on 10 September 1990.[3]

Prior to September 1993, the World Wide Web was entirely indexed by hand. There was a list of webservers edited by Tim Berners-Lee and hosted on the CERN webserver. One snapshot of the list from 1992 remains,[4] but as more and more web servers went online, the central list could no longer keep up. On the NCSA site, new servers were announced under the title "What's New!"[5]

The first tool used for searching content (as opposed to users) on the Internet was Archie.[6] The name stands for "archive" without the "v". It was created by Alan Emtage, Bill Heelan and J. Peter Deutsch, computer science students at McGill University in Montreal, Quebec, Canada. The program downloaded the directory listings of all the files located on public anonymous FTP (File Transfer Protocol) sites, creating a searchable database of file names; however, Archie did not index the contents of these sites, since the amount of data was so limited it could be readily searched manually.

The rise of Gopher (created in 1991 by Mark McCahill at the University of Minnesota) led to two new search programs, Veronica and Jughead. Like Archie, they searched the file names and titles stored in Gopher index systems. Veronica (Very Easy Rodent-Oriented Net-wide Index to Computerized Archives) provided a keyword search of most Gopher menu titles in the entire Gopher listings. Jughead (Jonzy's Universal Gopher Hierarchy Excavation And Display) was a tool for obtaining menu information from specific Gopher servers. While the name "Archie" was not a reference to the Archie comic book series, "Veronica" and "Jughead" are characters in the series, thus referencing their predecessor.

In the summer of 1993, no search engine existed for the web, though numerous specialized catalogues were maintained by hand. Oscar Nierstrasz at the University of Geneva wrote a series of Perl scripts that periodically mirrored these pages and rewrote them into a standard format. This formed the basis for W3Catalog, the web's first primitive search engine, released on September 2, 1993.[7]

In June 1993, Matthew Gray, then at MIT, produced what was probably the first web robot, the Perl-based World Wide Web Wanderer, and used it to generate an index called "Wandex". The purpose of the Wanderer was to measure the size of the World Wide Web, which it did until late 1995. The web's second search engine, Aliweb, appeared in November 1993. Aliweb did not use a web robot, but instead depended on being notified by website administrators of the existence at each site of an index file in a particular format.

NCSA's Mosaic was not the first Web browser, but it was the first to make a major splash. In November 1993, Mosaic v1.0 broke away from the small pack of existing browsers by including features, such as icons, bookmarks, a more attractive interface, and pictures, that made the software easy to use and appealing to "non-geeks."

JumpStation (created in December 1993[8] by Jonathon Fletcher) used a web robot to find web pages and to build its index, and used a web form as the interface to its query program. It was thus the first WWW resource-discovery tool to combine the three essential features of a web search engine (crawling, indexing, and searching) as described below. Because of the limited resources available on the platform it ran on, its indexing and hence searching were limited to the titles and headings found in the web pages the crawler encountered.

One of the first "all text" crawler-based search engines was WebCrawler, which came out in 1994. Unlike its predecessors, it allowed users to search for any word in any webpage, which has become the standard for all major search engines since. It was also the first one widely known by the public. Also in 1994, Lycos (which started at Carnegie Mellon University) was launched and became a major commercial endeavor.

Soon after, many search engines appeared and vied for popularity. These included Magellan, Excite, Infoseek, Inktomi, Northern Light, and AltaVista. Yahoo! was among the most popular ways for people to find web pages of interest, but its search function operated on its web directory, rather than its full-text copies of web pages. Information seekers could also browse the directory instead of doing a keyword-based search.

In 1996, Netscape was looking to give a single search engine an exclusive deal as the featured search engine on Netscape's web browser. There was so much interest that instead Netscape struck deals with five of the major search engines: for $5 million a year, each search engine would be in rotation on the Netscape search engine page. The five engines were Yahoo!, Magellan, Lycos, Infoseek, and Excite.[9][10]

Google adopted the idea of selling search terms in 1998 from a small search engine company named goto.com. This move had a significant effect on the search engine business, which went from struggling to one of the most profitable businesses on the internet.[11]

Search engines were also known as some of the brightest stars in the Internet investing frenzy that occurred in the late 1990s.[12] Several companies entered the market spectacularly, receiving record gains during their initial public offerings. Some have taken down their public search engine, and are marketing enterprise-only editions, such as Northern Light. Many search engine companies were caught up in the dot-com bubble, a speculation-driven market boom that peaked in 1999 and ended in 2001.

Around 2000, Google's search engine rose to prominence.[13] The company achieved better results for many searches with an innovation called PageRank, as explained in the paper Anatomy of a Search Engine written by Sergey Brin and Larry Page, who later founded Google.[14] This iterative algorithm ranks web pages based on the number and PageRank of other web sites and pages that link there, on the premise that good or desirable pages are linked to more than others. Google also maintained a minimalist interface to its search engine. In contrast, many of its competitors embedded a search engine in a web portal. In fact, the Google search engine became so popular that spoof engines emerged, such as Mystery Seeker.
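The iterative idea behind PageRank can be sketched in a few lines. This is only a toy version for illustration, assuming a hypothetical three-page link graph; the production algorithm handles billions of pages and many refinements not shown here.

```python
def pagerank(links, damping=0.85, iterations=50):
    """links maps each page to a list of pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}          # start with equal rank
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outgoing in links.items():
            if not outgoing:                    # dangling page: spread rank evenly
                for p in pages:
                    new_rank[p] += damping * rank[page] / n
            else:                               # pass rank along each outgoing link
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share
        rank = new_rank
    return rank

# Hypothetical three-page web: both "a" and "b" link to "c".
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
ranks = pagerank(graph)
print(max(ranks, key=ranks.get))  # "c" ends up ranked highest
```

Each iteration redistributes rank along the links, so well-linked pages accumulate rank from their inbound neighbours, exactly the "good pages are linked to more" premise described above.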

By 2000, Yahoo! was providing search services based on Inktomi's search engine. Yahoo! acquired Inktomi in 2002, and Overture (which owned AlltheWeb and AltaVista) in 2003. Yahoo! used Google's search engine for its results until 2004, when it launched its own search engine based on the combined technologies of its acquisitions.

Microsoft first launched MSN Search in the fall of 1998 using search results from Inktomi. In early 1999 the site began to display listings from Looksmart, blended with results from Inktomi. For a short time in 1999, MSN Search used results from AltaVista instead. In 2004, Microsoft began a transition to its own search technology, powered by its own web crawler (called msnbot).

Microsoft's rebranded search engine, Bing, was launched on June 1, 2009. On July 29, 2009, Yahoo! and Microsoft finalized a deal in which Yahoo! Search would be powered by Microsoft Bing technology.

Approach[edit]

A search engine maintains the following processes in near real time:

  1. Web crawling
  2. Indexing
  3. Searching[15]

Web search engines get their information by web crawling from site to site. The "spider" checks for the standard filename robots.txt, addressed to it, before sending certain information back to be indexed depending on many factors, such as the titles, page content, JavaScript, Cascading Style Sheets (CSS), headings, as evidenced by the standard HTML markup of the informational content, or its metadata in HTML meta tags. "[N]o web crawler may actually crawl the entire reachable web. Due to infinite websites, spider traps, spam, and other exigencies of the real web, crawlers instead apply a crawl policy to determine when the crawling of a site should be deemed sufficient. Some sites are crawled exhaustively, while others are crawled only partially".[16]
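The robots.txt check described above can be sketched with Python's standard-library parser. The crawler name, domain, and rules below are made up for illustration; a real spider would fetch the file from the site before consulting it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules a site might serve as /robots.txt
rules = [
    "User-agent: *",
    "Disallow: /private/",
]

rp = RobotFileParser()
rp.modified()   # mark the rules as freshly fetched, so can_fetch() will answer
rp.parse(rules)

print(rp.can_fetch("MyCrawler", "http://example.com/index.html"))  # True
print(rp.can_fetch("MyCrawler", "http://example.com/private/x"))   # False
```

A polite crawler calls a check like this before every request, and skips any URL the site's rules disallow.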

Indexing means associating words and other definable tokens found on web pages to their domain names and HTML-based fields. The associations are made in a public database, made available for web search queries. A query from a user can be a single word. The index helps find information relating to the query as quickly as possible.[15] Some of the techniques for indexing and caching are trade secrets, whereas web crawling is a straightforward process of visiting all sites on a systematic basis.
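The core association can be illustrated with a minimal "inverted index", assuming a handful of hypothetical pages: each token maps to the set of pages in which it occurs, so a query term can be resolved with a single lookup.

```python
from collections import defaultdict

# Hypothetical pages standing in for crawled documents.
pages = {
    "page1.html": "lunar eclipse photos",
    "page2.html": "solar eclipse explained",
    "page3.html": "lunar rover images",
}

# Build the inverted index: token -> set of pages containing it.
index = defaultdict(set)
for name, text in pages.items():
    for token in text.lower().split():
        index[token].add(name)

print(sorted(index["eclipse"]))  # ['page1.html', 'page2.html']
print(sorted(index["lunar"]))    # ['page1.html', 'page3.html']
```

Real indexes also record positions, fields, and metadata for each token, but the lookup principle is the same.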

Between visits by the spider, the cached version of a page (some or all the content needed to render it) stored in the search engine's working memory is quickly sent to an inquirer. If a visit is overdue, the search engine can just act as a web proxy instead. In this case, the page may differ from the search terms indexed.[15] The cached page holds the appearance of the version whose words were indexed, so a cached version of a page can be useful to the web site when the actual page has been lost, but this problem is also considered a mild form of linkrot.

High-level architecture of a standard Web crawler

Typically when a user enters a query into a search engine it is a few keywords.[17] The index already has the names of the sites containing the keywords, and these are instantly obtained from the index. The real processing load is in generating the web pages that are the search results list: Every page in the entire list must be weighted according to information in the indexes.[15] Then the top search result item requires the lookup, reconstruction, and markup of the snippets showing the context of the keywords matched. These are only part of the processing each search results web page requires, and further pages (next to the top) require more of this post processing.
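The weighting-and-snippet step can be illustrated with a toy scorer, assuming hypothetical page contents: pages are weighted by how often the query terms occur, and a snippet is cut around the first match. Real engines use far more ranking signals than raw term counts.

```python
# Hypothetical page texts standing in for a crawled corpus.
pages = {
    "moon.html": "a total lunar eclipse occurs when the moon passes into shadow",
    "sun.html": "a solar eclipse occurs when the moon blocks out the sun",
}

def search(query, pages, snippet_width=30):
    """Weight pages by query-term frequency; cut a snippet at the first hit."""
    terms = query.lower().split()
    results = []
    for name, text in pages.items():
        score = sum(text.count(t) for t in terms)
        if score:
            first_hit = min(text.find(t) for t in terms if t in text)
            snippet = text[first_hit:first_hit + snippet_width]
            results.append((score, name, snippet))
    return sorted(results, reverse=True)  # highest score first

for score, name, snippet in search("lunar eclipse", pages):
    print(name, score, snippet)
```

Here "moon.html" matches both terms and is listed first; producing the snippet requires going back to the stored page text, which is why snippet generation is part of the post-processing load described above.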

Beyond simple keyword lookups, search engines offer their own GUI- or command-driven operators and search parameters to refine the search results. These provide the necessary controls for the feedback loop users create by filtering and weighting while refining the search results, given the initial pages of the first search results. For example, since 2007 the Google.com search engine has allowed one to filter by date by clicking "Show search tools" in the leftmost column of the initial search results page, and then selecting the desired date range.[18] It is also possible to weight by date because each page has a modification time. Most search engines support the use of the boolean operators AND, OR and NOT to help end users refine the search query. Boolean operators are for literal searches that allow the user to refine and extend the terms of the search. The engine looks for the words or phrases exactly as entered. Some search engines provide an advanced feature called proximity search, which allows users to define the distance between keywords.[15] There is also concept-based searching, where the research involves using statistical analysis on pages containing the words or phrases searched for. Finally, natural language queries allow the user to type a question in the same form one would ask it to a human;[19] Ask.com is a site of this kind.[20]
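Over an inverted index, the boolean operators map naturally onto set operations, intersection for AND, union for OR, and difference for NOT. The tiny index below is hypothetical, purely to show the correspondence.

```python
# A hypothetical inverted index over three pages.
index = {
    "lunar": {"p1", "p3"},
    "eclipse": {"p1", "p2"},
    "solar": {"p2"},
}
all_pages = {"p1", "p2", "p3"}

# lunar AND eclipse -> set intersection
print(index["lunar"] & index["eclipse"])        # {'p1'}
# lunar OR solar -> set union
print(sorted(index["lunar"] | index["solar"]))  # ['p1', 'p2', 'p3']
# eclipse NOT solar -> set difference
print(index["eclipse"] - index["solar"])        # {'p1'}
# NOT solar on its own -> complement within the indexed pages
print(sorted(all_pages - index["solar"]))       # ['p1', 'p3']
```

Because the index already stores the page sets, these operations never need to rescan page text, which is what makes literal boolean queries fast.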

The usefulness of a search engine depends on the relevance of the result set it gives back. While there may be millions of web pages that include a particular word or phrase, some pages may be more relevant, popular, or authoritative than others. Most search engines employ methods to rank the results to provide the "best" results first. How a search engine decides which pages are the best matches, and what order the results should be shown in, varies widely from one engine to another.[15] The methods also change over time as Internet usage changes and new techniques evolve. Two main types of search engine have evolved: one is a system of predefined and hierarchically ordered keywords that humans have programmed extensively. The other is a system that generates an "inverted index" by analyzing texts it locates. This second form relies much more heavily on the computer itself to do the bulk of the work.

Most Web search engines are commercial ventures supported by advertising revenue and thus some of them allow advertisers to have their listings ranked higher in search results for a fee. Search engines that do not accept money for their search results make money by running search related ads alongside the regular search engine results. The search engines make money every time someone clicks on one of these ads.[21]

Market share[edit]

Google is the world's most popular search engine, with a market share of 74.52 percent as of February 2018.[22]

The world's most popular search engines, each with a market share of more than 1 percent, are Google, Bing, Baidu and Yahoo!.

East Asia and Russia[edit]

In some East Asian countries and Russia, Google is not the most popular search engine.

In Russia, Yandex commands a market share of 61.9 percent, compared to Google's 28.3 percent.[23] In China, Baidu is the most popular search engine.[24] South Korea's homegrown search portal, Naver, is used for 70 percent of online searches in the country.[25] Yahoo! Japan and Yahoo! Taiwan are the most popular avenues for internet search in Japan and Taiwan, respectively.[26]

Europe[edit]

Most countries' markets in Western Europe are dominated by Google, except for the Czech Republic, where Seznam is a strong competitor.[27]

Search engine bias[edit]

Although search engines are programmed to rank websites based on some combination of their popularity and relevancy, empirical studies indicate various political, economic, and social biases in the information they provide[28][29] and the underlying assumptions about the technology.[30] These biases can be a direct result of economic and commercial processes (e.g., companies that advertise with a search engine can become also more popular in its organic search results), and political processes (e.g., the removal of search results to comply with local laws).[31] For example, Google will not surface certain neo-Nazi websites in France and Germany, where Holocaust denial is illegal.

Biases can also be a result of social processes, as search engine algorithms are frequently designed to exclude non-normative viewpoints in favor of more "popular" results.[32] Indexing algorithms of major search engines skew towards coverage of U.S.-based sites, rather than websites from non-U.S. countries.[29]

Google Bombing is one example of an attempt to manipulate search results for political, social or commercial reasons.

Several scholars have studied the cultural changes triggered by search engines,[33] and the representation of certain controversial topics in their results, such as terrorism in Ireland[34] and conspiracy theories.[35]

Customized results and filter bubbles[edit]

Many search engines such as Google and Bing provide customized results based on the user's activity history. This leads to an effect that has been called a filter bubble. The term describes a phenomenon in which websites use algorithms to selectively guess what information a user would like to see, based on information about the user (such as location, past click behaviour and search history). As a result, websites tend to show only information that agrees with the user's past viewpoint, effectively isolating the user in a bubble that tends to exclude contrary information. Prime examples are Google's personalized search results and Facebook's personalized news stream. According to Eli Pariser, who coined the term, users get less exposure to conflicting viewpoints and are isolated intellectually in their own informational bubble. Pariser related an example in which one user searched Google for "BP" and got investment news about British Petroleum while another searcher got information about the Deepwater Horizon oil spill and that the two search results pages were "strikingly different".[36][37][38] The bubble effect may have negative implications for civic discourse, according to Pariser.[39] Since this problem has been identified, competing search engines have emerged that seek to avoid this problem by not tracking or "bubbling" users, such as DuckDuckGo. Other scholars do not share Pariser's view, finding the evidence in support of his thesis unconvincing.[40]

Christian, Islamic and Jewish search engines[edit]

The global growth of the Internet and electronic media in the Arab and Muslim world during the last decade has encouraged Islamic adherents in the Middle East and the Asian sub-continent to attempt their own search engines, their own filtered search portals that would enable users to perform safe searches. Going beyond the usual safe search filters, these Islamic web portals categorize websites as either "halal" or "haram", based on modern, expert interpretation of the "Law of Islam". ImHalal came online in September 2011. Halalgoogling came online in July 2013. These use haram filters on the collections from Google and Bing (and others).[41]

While a lack of investment and the slow pace of technology adoption in the Muslim world have hindered progress and thwarted the success of an Islamic search engine targeting Islamic adherents as its main consumers, projects like Muxlim, a Muslim lifestyle site, did receive millions of dollars from investors like Rite Internet Ventures, and it also faltered. Other religion-oriented search engines are Jewgle, the Jewish version of Google, and SeekFind.org, which is Christian. SeekFind filters sites that attack or degrade their faith.[42]

Search engine submission