Chinese martial arts, also referred to by the Mandarin Chinese term wushu (simplified Chinese: 武术; traditional Chinese: 武術; pinyin: wǔshù) and popularly as kung fu (Chinese: 功夫; pinyin: gōngfu), are a number of fighting styles that have developed over the centuries in China. These fighting styles are often classified according to common traits, identified as "families" (家, jiā), "sects" (派, pài) or "schools" (門, mén) of martial arts. Examples of such traits include physical exercises involving animal mimicry, or training methods inspired by Chinese philosophies, religions and legends. Styles which focus on qi manipulation are labeled as internal (内家拳, nèijiāquán), while others concentrate on improving muscle and cardiovascular fitness and are labeled external (外家拳, wàijiāquán). Geographical association, as in northern (北拳, běiquán) and southern (南拳, nánquán), is another popular method of categorization.
Terminology
Kung fu and wushu are terms that have been borrowed into English to refer to Chinese martial arts. However, the Chinese terms kung fu and wushu (Cantonese: móuh-seuht) have distinct meanings;[1] the Chinese literal equivalent of "Chinese martial art" would be Zhongguo wushu (Chinese: 中國武術; pinyin: zhōngguó wǔshù).
Wǔshù literally means "martial art". It is formed from the two characters 武術: 武 (wǔ), meaning "martial" or "military", and 術 (shù), which translates into "discipline", "skill" or "method".
The term wushu has also become the name for the modern sport of wushu, an exhibition and full-contact sport of bare-handed and weapons forms (Chinese: 套路, pinyin: tàolù), adapted and judged to a set of aesthetic criteria for points developed since 1949 in the People's Republic of China.
The term "kung fu"
In Chinese, kung fu can also be used in contexts completely unrelated to martial arts, and refers colloquially to any individual accomplishment or skill cultivated through long and hard work.[1] Wushu is a more precise term for general martial activities.
History
The genesis of Chinese martial arts has been attributed to the need for self-defense, hunting techniques and military training in ancient China. Hand-to-hand combat and weapons practice were important in training ancient Chinese soldiers.[4][5]
While it is clear that various forms of martial arts have been practiced in China since antiquity, very little detail on specifics can be recovered for times predating the 16th century. By contrast, there is a variety of sources on the topic from the Qing period (1644 to 1912).
Detailed knowledge about the state and development of Chinese martial arts becomes available from the Nanjing decade (1928–1937), as the Central Guoshu Institute established by the Kuomintang regime made an effort to compile an encyclopedic survey of martial arts schools. Since the 1950s, the People's Republic of China has organized Chinese martial arts as an exhibition and full-contact sport under the heading of Wushu.
Legendary origins
According to legend, Chinese martial arts originated during the semi-mythical Xia Dynasty (夏朝) more than 4,000 years ago.[6] It is said the Yellow Emperor Huangdi (legendary date of ascension 2698 BCE) introduced the earliest fighting systems to China.[7] The Yellow Emperor is described as a famous general who, before becoming China's leader, wrote lengthy treatises on medicine, astrology and the martial arts. One of his main opponents was Chi You (蚩尤), who is credited as the creator of jiao di, a forerunner to the modern art of Chinese wrestling.[8]
Early history
The earliest references to Chinese martial arts are found in the Spring and Autumn Annals (5th century BCE),[9] where a hand-to-hand combat theory, including the integration of notions of "hard" and "soft" techniques, is mentioned.[10] A combat wrestling system called juélì or jiǎolì (角力) is mentioned in the Classic of Rites (1st century BCE).[11] This combat system included techniques such as strikes, throws, joint manipulation, and pressure point attacks. Jiao Di became a sport during the Qin Dynasty (221–207 BCE). The Han History Bibliographies record that, by the Former Han (206 BCE – 8 CE), there was a distinction between no-holds-barred weaponless fighting, which it calls shǒubó (手搏), for which "how-to" manuals had already been written, and sportive wrestling, then known as juélì (角力). Wrestling is also documented in the Shǐ Jì, Records of the Grand Historian, written by Sima Qian (ca. 100 BCE).[12]
In the Tang Dynasty, descriptions of sword dances were immortalized in poems by Li Bai. In the Song and Yuan dynasties, xiangpu contests were sponsored by the imperial courts. The modern concepts of wushu were fully developed by the Ming and Qing dynasties.[13]
Philosophical influences
The ideas associated with Chinese martial arts changed with the evolution of Chinese society and over time acquired some philosophical bases. Passages in the Zhuangzi (庄子), a Daoist text, pertain to the psychology and practice of martial arts. Zhuangzi, its eponymous author, is believed to have lived in the 4th century BCE. The Tao Te Ching, often credited to Lao Zi, is another Daoist text that contains principles applicable to martial arts. According to one of the classic texts of Confucianism, Zhou Li (周禮/周礼), archery and charioteering were part of the "six arts" (simplified Chinese: 六艺; traditional Chinese: 六藝; pinyin: liu yi, including rites, music, calligraphy and mathematics) of the Zhou Dynasty (1122–256 BCE). The Art of War (孫子兵法), written during the 6th century BCE by Sun Tzu (孫子), deals directly with military warfare but contains ideas that are used in the Chinese martial arts.
Daoist practitioners have been practicing Tao Yin (physical exercises similar to Qigong that were among the progenitors of T'ai chi ch'uan) from as early as 500 BCE.[14] In 39–92 CE, "Six Chapters of Hand Fighting" was included in the Han Shu (history of the Former Han Dynasty) written by Pan Ku. Also, the noted physician Hua Tuo composed the "Five Animals Play" (tiger, deer, monkey, bear, and bird) around 200 CE.[15] Daoist philosophy and its approach to health and exercise have influenced the Chinese martial arts to a certain extent. Direct reference to Daoist concepts can be found in such styles as the "Eight Immortals", which uses fighting techniques that are attributed to the characteristics of each immortal.[16]
Shaolin and temple-based martial arts
Main article: Shaolin Monastery
The Shaolin style of wushu is regarded as amongst the first institutionalized Chinese martial arts.[17] The oldest evidence of Shaolin participation in combat is a stele from 728 CE that attests to two occasions: a defense of the Shaolin Monastery from bandits around 610 CE, and their subsequent role in the defeat of Wang Shichong at the Battle of Hulao in 621 CE. From the 8th to the 15th centuries, there are no extant documents that provide evidence of Shaolin participation in combat.
Between the 16th and 17th centuries, no fewer than forty sources exist to provide evidence both that monks of Shaolin practiced martial arts, and that martial practice became an integral element of Shaolin monastic life. The earliest appearance of the frequently cited legend of Bodhidharma's supposed foundation of Shaolin Kung Fu dates to this period, when monks sought to justify their martial practice by creating new Buddhist lore.[18] The origin of this legend has been traced to the Ming period's Yijin Jing or "Muscle Change Classic", a text written in 1624 and attributed to Bodhidharma.
Depiction of fighting monks demonstrating their skills to visiting dignitaries (early 19th-century mural in the Shaolin Monastery).
References to martial arts practice in Shaolin appear in various literary genres of the late Ming: the epitaphs of Shaolin warrior monks, martial-arts manuals, military encyclopedias, historical writings, travelogues, fiction and poetry. However, these sources do not point to any specific style that originated in Shaolin.[19] These sources, in contrast to those from the Tang period, refer to Shaolin methods of armed combat. These include a skill for which Shaolin monks had become famous: the staff (gùn, Cantonese gwan). The Ming General Qi Jiguang included a description of Shaolin Quan Fa (Pinyin romanization: Shào Lín Quán Fǎ or Wade-Giles romanization Shao Lin Ch'üan Fa, 少林拳法 "fist principles"; Japanese pronunciation: Shorin Kempo or Kenpo) and staff techniques in his book, Ji Xiao Xin Shu (紀效新書), which can be translated as "New Book Recording Effective Techniques". When this book spread to East Asia, it had a great influence on the development of martial arts in regions such as Okinawa[20] and Korea.[21]
Modern history
Further information: Modern history of East Asian martial arts
Republican period
Most fighting styles that are being practiced as traditional Chinese martial arts today reached their popularity within the 20th century. Some of these include Bagua, Drunken Boxing, Eagle Claw, Five Animals, Hsing I, Hung Gar, Monkey, Bak Mei Pai, Praying Mantis, Fujian White Crane, Jow Ga, Wing Chun and T'ai chi ch'uan. The increase in the popularity of those styles is a result of the dramatic changes occurring within Chinese society.
In 1900–01, the Righteous and Harmonious Fists rose against foreign occupiers and Christian missionaries in China. This uprising is known in the West as the Boxer Rebellion due to the martial arts and calisthenics practiced by the rebels. Though it originally opposed the Manchu Qing Dynasty, the Empress Dowager Cixi gained control of the rebellion and tried to use it against the foreign powers. The failure of the rebellion led ten years later to the fall of the Qing Dynasty and the creation of the Chinese Republic.
The present view of Chinese martial arts is strongly influenced by the events of the Republican Period (1912–1949). In the transition period between the fall of the Qing Dynasty and the turmoil of the Japanese invasion and the Chinese Civil War, Chinese martial arts became more accessible to the general public as many martial artists were encouraged to openly teach their art. At that time, some considered martial arts as a means to promote national pride and build a strong nation. As a result, many training manuals (拳谱) were published, a training academy was created, two national examinations were organized, demonstration teams travelled overseas,[22] and numerous martial arts associations were formed throughout China and in various overseas Chinese communities. The Central Guoshu Academy (Zhongyang Guoshuguan, 中央國術館/中央国术馆), established by the National Government in 1928,[23] and the Jing Wu Athletic Association (精武體育會/精武体育会), founded by Huo Yuanjia in 1910, are examples of organizations that promoted a systematic approach for training in Chinese martial arts.[24][25][26] A series of provincial and national competitions was organized by the Republican government starting in 1932 to promote Chinese martial arts. In 1936, at the 11th Olympic Games in Berlin, a group of Chinese martial artists demonstrated their art to an international audience for the first time.
The term Kuoshu (or Guoshu, 國術, meaning "national art"), rather than the colloquial term gongfu, was introduced by the Kuomintang in an effort to more closely associate Chinese martial arts with national pride rather than individual accomplishment.
People's Republic
Further information: Wushu (sport) and International Wushu Federation
Chinese martial arts experienced rapid international dissemination with the end of the Chinese Civil War and the founding of the People's Republic of China on October 1, 1949. Many well-known martial artists chose to escape from the PRC's rule and migrate to Taiwan, Hong Kong,[27] and other parts of the world. Those masters started to teach within the overseas Chinese communities, but eventually they expanded their teachings to include people from other ethnic groups.
Within China, the practice of traditional martial arts was discouraged during the turbulent years of the Chinese Cultural Revolution (1966–1976).[3] Like many other aspects of traditional Chinese life, martial arts were subjected to a radical transformation by the People's Republic of China in order to align them with Maoist revolutionary doctrine.[3] The PRC promoted the committee-regulated sport of Wushu as a replacement for independent schools of martial arts. This new competition sport was disassociated from what was seen as the potentially subversive self-defense aspects and family lineages of Chinese martial arts.[3]
In 1958, the government established the All-China Wushu Association as an umbrella organization to regulate martial arts training. The Chinese State Commission for Physical Culture and Sports took the lead in creating standardized forms for most of the major arts. During this period, a national Wushu system that included standard forms, teaching curriculum, and instructor grading was established. Wushu was introduced at both the high school and university level. The suppression of traditional teaching was relaxed during the Era of Reconstruction (1976–1989), as Communist ideology became more accommodating to alternative viewpoints.[28] In 1979, the State Commission for Physical Culture and Sports created a special task force to reevaluate the teaching and practice of Wushu. In 1986, the Chinese National Research Institute of Wushu was established as the central authority for the research and administration of Wushu activities in the People's Republic of China.[29]
Changing government policies and attitudes towards sports in general led to the closing of the State Sports Commission (the central sports authority) in 1998. This closure is viewed as an attempt to partially de-politicize organized sports and move Chinese sport policies towards a more market-driven approach.[30] As a result of these changing sociological factors within China, both traditional styles and modern Wushu approaches are being promoted by the Chinese government.[31]
Chinese martial arts are an integral element of 20th-century Chinese popular culture.[32] Wuxia or "martial arts fiction" is a popular genre which emerged in the early 20th century and peaked in popularity during the 1960s to 1980s. Wuxia films were produced from the 1920s. The Kuomintang suppressed wuxia, accusing it of promoting superstition and violent anarchy. Because of this, wuxia came to flourish in British Hong Kong, and the genre of kung fu movie in Hong Kong action cinema became wildly popular, coming to international attention from the 1970s. The genre declined somewhat during the 1980s, and in the late 1980s the Hong Kong film industry underwent a drastic decline, even before Hong Kong was handed to the People's Republic in 1997. In the wake of Crouching Tiger, Hidden Dragon (2000), there has been somewhat of a revival of Chinese-produced wuxia films aimed at an international audience, including Hero (2002), House of Flying Daggers (2004) and Reign of Assassins (2010).
Styles
Main article: Styles of Chinese martial arts
See also: List of Chinese martial arts
The Yang style of taijiquan being practiced on the Bund in Shanghai
China has a long history of martial traditions that includes hundreds of different styles. Over the past two thousand years many distinctive styles have been developed, each with its own set of techniques and ideas. There are also common themes to the different styles, which are often classified by "families" (家, jiā), "sects" (派, pai) or "schools" (門, men). There are styles that mimic movements from animals and others that gather inspiration from various Chinese philosophies, myths and legends. Some styles put most of their focus into the harnessing of qi, while others concentrate on competition.
Chinese martial arts can be split into various categories to differentiate them, for example external (外家拳) and internal (内家拳).[34] They can also be categorized by location, as in northern (北拳) and southern (南拳), referring to the part of China in which the styles originated, separated by the Yangtze River (Chang Jiang); they may even be classified according to their province or city.[22] The main perceived difference between northern and southern styles is that the northern styles tend to emphasize fast and powerful kicks, high jumps and generally fluid and rapid movement, while the southern styles focus more on strong arm and hand techniques, stable, immovable stances and fast footwork. Examples of the northern styles include changquan and xingyiquan. Examples of the southern styles include Bak Mei, Wuzuquan, Choy Li Fut and Wing Chun. Chinese martial arts can also be divided according to religion, imitative styles (象形拳), and family styles such as Hung Gar (洪家). There are distinctive differences in the training between different groups of the Chinese martial arts regardless of the type of classification. However, few experienced martial artists make a clear distinction between internal and external styles, or subscribe to the idea of northern systems being predominantly kick-based and southern systems relying more heavily on upper-body techniques. Most styles contain both hard and soft elements, regardless of their internal nomenclature. Analyzing the difference in accordance with yin and yang principles, philosophers would assert that the absence of either one would render the practitioner's skills unbalanced or deficient, as yin and yang alone are each only half of a whole. If such differences did once exist, they have since been blurred.
Training
Chinese martial arts training consists of the following components: basics, forms, applications and weapons; different styles place varying emphasis on each component.[35] In addition, philosophy, ethics and even medical practice[36] are highly regarded by most Chinese martial arts. A complete training system should also provide insight into Chinese attitudes and culture.[37]
Basics
The basics (基本功) are a vital part of any martial training, as a student cannot progress to the more advanced stages without them. Basics are usually made up of rudimentary techniques and conditioning exercises, including stances. Basic training may involve simple movements that are performed repeatedly; other examples of basic training are stretching, meditation, striking, throwing, or jumping. Without strong and flexible muscles, management of qi or breath, and proper body mechanics, it is impossible for a student to progress in the Chinese martial arts.[38][39] A common saying concerning basic training in Chinese martial arts is as follows:[40]
内外相合,外重手眼身法步,内修心神意氣力。
Which can be translated as:
Train both Internal and External.
External training includes the hands, the eyes, the body and stances.
Internal training includes the heart, the spirit, the mind, breathing and strength.
Stances
Stances (steps or 步法) are structural postures employed in Chinese martial arts training.[41][42] They represent the foundation and the form of a fighter's base. Each style has different names and variations for each stance. Stances may be differentiated by foot position, weight distribution, body alignment, etc. Stance training can be practiced statically, the goal of which is to maintain the structure of the stance through a set time period, or dynamically, in which case a series of movements is performed repeatedly. The horse-riding stance (骑马步/马步 qí mǎ bù/mǎ bù) and the bow stance are examples of stances found in many styles of Chinese martial arts.
Meditation
In many Chinese martial arts, meditation is considered to be an important component of basic training. Meditation can be used to develop focus and mental clarity, and can act as a basis for qigong training.
Use of qi
Main article: Qigong
The concept of qi or ch'i (氣/气) is encountered in a number of Chinese martial arts. Qi is variously defined as an inner energy or "life force" that is said to animate living beings; as a term for proper skeletal alignment and efficient use of musculature (sometimes also known as fa jin or jin); or as a shorthand for concepts that the martial arts student might not yet be ready to understand in full. These meanings are not necessarily mutually exclusive.[note 1] The existence of qi as a measurable form of energy as discussed in traditional Chinese medicine has no basis in the scientific understanding of physics, medicine, biology or human physiology.[45]
There are many ideas regarding the control of one's qi energy to such an extent that it can be used for healing oneself or others: the goal of medical qigong. Some styles believe in focusing qi into a single point when attacking and aim at specific areas of the human body. Such techniques are known as dim mak and have principles that are similar to acupressure.[46]
Weapons training
Further information: Chinese swordsmanship
Most Chinese styles also make use of training in the broad arsenal of Chinese weapons for conditioning the body as well as coordination and strategy drills.[47] Weapons training (qìxiè 器械) is generally carried out after the student is proficient in the basics, forms and applications training. The basic theory for weapons training is to consider the weapon as an extension of the body. It has the same requirements for footwork and body coordination as the basics.[48] The process of weapon training proceeds with forms, forms with partners and then applications. Most systems have training methods for each of the Eighteen Arms of Wushu (shíbābānbīngqì 十八般兵器) in addition to specialized instruments specific to the system.
Application
Main article: Lei tai
See also: Sanshou and Shuai jiao
Application refers to the practical use of combative techniques. Chinese martial arts techniques are ideally based on efficiency and effectiveness.[49][50] Application includes non-compliant drills, such as Pushing Hands in many internal martial arts, and sparring, which occurs within a variety of contact levels and rule sets.
When and how applications are taught varies from style to style. Today, many styles begin to teach new students by focusing on exercises in which each student knows a prescribed range of combat and technique to be drilled; these drills are often semi-compliant, meaning one student does not offer active resistance to a technique in order to allow its demonstrative, clean execution. In more resisting drills, fewer rules are applied and students practice how to react and respond. 'Sparring' refers to the most important aspect of application training, which simulates a combat situation while including rules and regulations in order to reduce the chance of serious injury to the students.
Competitive sparring disciplines include Chinese kickboxing, Sǎnshǒu (散手), and Chinese folk wrestling, Shuāijiāo (摔跤), which were traditionally contested on a raised platform arena, the Lèitái (擂台).[51] Lèitái represents public challenge matches that first appeared in the Song Dynasty. The objective for those contests was to knock the opponent from a raised platform by any means necessary. Sanshou represents the modern development of Lèitái contests, but with rules in place to reduce the chance of serious injury. Many Chinese martial art schools teach or work within the rule sets of Sanshou, working to incorporate the movements, characteristics, and theory of their style.[52] Chinese martial artists also compete in non-Chinese or mixed combat sports, including boxing, kickboxing and mixed martial arts.
Forms
Further information: form (martial arts)
Forms or taolu (Chinese: 套路; pinyin: tào lù) in Chinese are series of predetermined movements combined so they can be practiced as one linear set of movements. Forms were originally intended to preserve the lineage of a particular style branch, and were often taught to advanced students who were selected to preserve the art's lineage. Forms were designed to contain literal, representative and exercise-oriented forms of applicable techniques which would be extracted, tested and trained by students through sparring sessions.[53]
Today, many consider forms to be one of the most important practices in Chinese martial arts. Traditionally, they played a smaller role in training combat application, and were eclipsed by sparring, drilling and conditioning. Forms gradually build up a practitioner's flexibility, internal and external strength, speed and stamina, and teach balance and coordination. Many styles contain forms using a wide range of weapons of various length and type, utilizing one or two hands. There are also styles which focus on a certain type of weapon. Forms are meant to be practical, usable, and applicable, as well as to promote flow, meditation, flexibility, balance and coordination. Teachers are often heard to say "train your form as if you were sparring and spar as if it were a form."
There are two general types of forms in Chinese martial arts. Most common are "solo forms", which are performed by a single student. There are also "sparring" forms, which are choreographed fighting sets performed by two or more people. Sparring forms were designed both to acquaint beginning fighters with basic measures and concepts of combat, and to serve as performance pieces for the school. Sparring forms which utilize weapons are especially useful for teaching students the extension, range and technique required to manage a weapon.
Forms in Traditional Chinese Martial Arts
The term "taolu" (套路) is a shortened version of "Tao Lu Yun Dong" (套路运动), an expression that was introduced only recently with the popularity of modern wushu. This expression refers to "exercise sets" and is used in the context of athletics or sport.
In contrast, in traditional Chinese martial arts, alternative terminologies for the training (練) of sets or forms are:
lian quan tao (練拳套) – practicing sequence of fist;
lian quan jiao (練拳腳) – practicing fists and feet;
lian bing qi (練兵器) – practicing weapons;
dui da (對打) and dui lian (對練) – fighting sets.
Traditional "sparring" sets, called dui da (對打) or dui lian (對練), were an important part of Chinese martial arts for centuries. Dui lian literally means to train by a pair of combatants opposing each other (the character 練 means to practice, to train, to perfect one's skill, to drill). In addition, one of these terms is often included in the name of fighting sets: 雙演, shuang yan, 'paired practice'; 掙勝, zheng sheng, 'to struggle with strength for victory'; 敵, di, 'match' (the character suggests striking an enemy); and 破, po, 'to break'.
Generally there are 21, 18, 12, 9 or 5 drills or 'exchanges/groupings' of attacks and counterattacks in each dui lian (對練) set. These drills were considered only generic patterns and were never meant to be treated as inflexible 'tricks'. Students practiced smaller parts or exchanges individually, with opponents switching sides in a continuous flow. Dui lian were not only sophisticated and effective methods of passing on the fighting knowledge of the older generation; they were also important and effective training methods. The relationship between single sets and contact sets is quite complicated, in that in many cases there are skills which simply cannot be developed with single sets, and, conversely, with dui lian. Unfortunately, it appears that most traditional combat-oriented dui lian and their training methodology have disappeared, especially those concerning weapons. There are a number of reasons for this. In modern Chinese martial arts most of the dui lian are recent inventions designed for light props resembling weapons, with safety and drama in mind. The role of this kind of training has degenerated to the point of being useless in a practical sense, and, at best, is just performance.
By the early Song period, sets were not so much "individual isolated techniques strung together" but rather were composed of technique and counter-technique groupings. It is quite clear that "sets" and "fighting (two-person) sets" have been instrumental in traditional Chinese martial arts for many hundreds of years, even before the Song Dynasty. There are images of two-person weapon training in Chinese stone painting going back at least to the Eastern Han Dynasty.
According to what has been passed on by the older generations, the ratio of contact sets to single sets was approximately 1:3. In other words, about 30% of the sets practiced at Shaolin were contact sets (dui lian, 對練) and two-person drill training. This is, in part, evidenced by the Qing Dynasty mural at Shaolin.
Ancient literature from the Tang and Northern Song Dynasties suggests that some sets, including those which required two or more participants, became very elaborate, "flowery", and mainly concerned with aesthetics. During this time, some martial arts systems devolved to the point that they became popular forms of martial art storytelling entertainment shows. This created an entire new category of martial arts known as Hua Fa Wuyi (花法武藝), or "fancy patterns for developing military skill". During the Northern Song period it was noted by historians that this phenomenon had a negative influence on training in the military.
For most of its history, Shaolin martial arts were largely weapon-focused: staves were used to defend the monastery, not bare hands. Even the more recent military exploits of Shaolin during the Ming and Qing Dynasties involved weapons. According to some traditions, monks first studied basics for one year and were then taught staff fighting so that they could protect the monastery. Although wrestling has been a sport in China for centuries, weapons have been the most important part of Chinese wushu since ancient times. If one wants to talk about recent or 'modern' developments in Chinese martial arts (including Shaolin, for that matter), it is the over-emphasis on bare-hand fighting. During the Northern Song Dynasty (976–997 CE), when platform fighting known as Da Laitai (Title Fights Challenge on Platform) first appeared, these fights were with only swords and staves. Although bare-hand fights appeared later as well, it was the weapons events that became the most famous. These open-ring competitions had regulations and were organized by government organizations; some were also organized by the public. The government competitions resulted in appointments to military posts for winners and were held in the capital as well as in the prefectures.
Controversy
Even though forms in Chinese martial arts are intended to depict realistic martial techniques, the movements are not always identical to how techniques would be applied in combat. Many forms have been elaborated upon, on the one hand to provide better combat preparedness, and on the other hand to look more aesthetically pleasing. One manifestation of this tendency toward elaboration beyond combat application is the use of lower stances and higher, stretching kicks. These two maneuvers are unrealistic in combat and are utilized in forms for exercise purposes.[54] Many modern schools have replaced practical defense or offense movements with acrobatic feats that are more spectacular to watch, thereby gaining favor during exhibitions and competitions.[note 2] This has led traditionalists to criticize the endorsement of the more acrobatic, show-oriented Wushu competition.[55] Even though appearance has always been important in many traditional forms as well, all patterns exist for their combat functionality. Historically, forms were often performed for entertainment long before the advent of modern Wushu, as practitioners looked for supplementary income by performing on the streets or in theaters. Ancient literature from the Tang Dynasty (618–907) and the Song Dynasty (960–1279) suggests that some sets (including sets for two or more people: dui da, 對打, also called dui lian, 對練) became very elaborate and "flowery", many of them mainly concerned with aesthetics. During this time, some martial arts systems devolved to the point that they became popular forms of martial art storytelling entertainment shows. This created an entire category of martial arts known as Hua Fa Wuyi (花法武藝), or "fancy patterns for developing military skill".
During the Northern Song period, it was noted by historians that this type of training had a negative influence on training in the military.
Many traditional Chinese martial artists, as well as practitioners of modern sport combat, have become critical of the perception that forms work is more relevant to the art than sparring and drill application, while most continue to see traditional forms practice within the traditional context as vital to proper combat execution, to the Shaolin aesthetic as an art form, and to upholding the meditative function of the physical art form.[56]
Another reason why techniques often appear different in forms when contrasted with sparring application is thought by some to come from the concealment of the actual functions of the techniques from outsiders.[57]
Wushu
Modern forms are used in the sport of wushu, as seen in this staff routine
See also: Wushu (sport)
"Wu" (武) is translated as "martial" in English; in terms of etymology, however, this word has a slightly different meaning. In Chinese, "wu" (武) is made up of two parts, the first meaning "stop" (zhi 止) and the second meaning "invader's lance" (ge 戈). This implies that "wu" (武) is a defensive use of combat. The term "wushu" (武術) meaning martial arts goes back only to the beginning of the 20th century. Prior to that it meant military affairs. The earliest term, found in the Han History (206 BC–23 AD), was "bing jiqiao" (兵技巧), military fighting techniques. During the Song period (c. 960) the name changed to "wuyi" (武艺), literally "martial arts". In 1928 the name was changed to "guoshu" (国术) or "national arts" when the National Martial Arts Academy was established in Nanjing. The term reverted to "wushu" (武術) under the People's Republic of China during the early 1950s.
As forms have grown in complexity and quantity over the years, and many forms alone could be practiced for a lifetime, modern styles of Chinese martial arts have developed that concentrate solely on forms, and do not practice application at all. These styles are primarily aimed at exhibition and competition, and often include more acrobatic jumps and movements added for enhanced visual effect[58] compared to the traditional styles. Those who generally prefer to practice traditional styles, focused less on exhibition, are often referred to as traditionalists. Some traditionalists consider the competition forms of today's Chinese martial arts too commercialized and to have lost much of their original values.[59][60]
"Martial Morality"
Traditional Chinese schools of martial arts, such as the famed Shaolin monks, often dealt with the study of martial arts not just as a means of self-defense or mental training, but as a system of ethics.[37][61] Wude (武德) can be translated as "martial morality" and is constructed from the words "wu" (武), which means martial, and "de" (德), which means morality. Wude deals with two aspects: "morality of deed" and "morality of mind". Morality of deed concerns social relations; morality of mind is meant to cultivate the inner harmony between the emotional mind (Xin, 心) and the wisdom mind (Hui, 慧). The ultimate goal is reaching "no extremity" (Wuji, 無極) (closely related to the Taoist concept of wu wei), where both wisdom and emotions are in harmony with each other.
Notable practitioners
See also: Category: Chinese martial artists and Category: Wushu practitioners
Examples of well-known practitioners (武术名师) throughout history:
Yue Fei (1103–1142 CE) was a famous Chinese general and patriot of the Song Dynasty. Styles such as Eagle Claw and Xingyi attribute their creation to Yue. However, there is no historical evidence to support the claim he created these styles.
Ng Mui (late 17th century) was the legendary female founder of many Southern martial arts such as Wing Chun Kuen, Dragon style and Fujian White Crane. She is often considered one of the legendary Five Elders who survived the destruction of the Shaolin Temple during the Qing Dynasty.
Yang Luchan (1799–1872) was an important teacher of the internal martial art known as t'ai chi ch'uan in Beijing during the second half of the 19th century. Yang is known as the founder of Yang-style t'ai chi ch'uan, and for transmitting the art to the Wu/Hao, Wu and Sun t'ai chi families.
Ten Tigers of Canton (late 19th century) was a group of ten of the top Chinese martial arts masters in Guangdong (Canton) towards the end of the Qing Dynasty (1644–1912). Wong Kei-Ying, Wong Fei Hung's father, was a member of this group.
Wong Fei Hung (1847–1924) was considered a Chinese folk hero during the Republican period. More than one hundred Hong Kong movies were made about his life. Sammo Hung, Jackie Chan, and Jet Li have all portrayed his character in blockbuster pictures.
Huo Yuanjia (1867–1910) was the founder of Chin Woo Athletic Association who was known for his highly publicized matches with foreigners. His biography was recently portrayed in the movie Fearless (2006).
Yip Man (1893–1972) was a master of Wing Chun and the first to teach this style openly. Yip Man was the teacher of Bruce Lee. Most major branches of Wing Chun that exist today were developed and promoted by students of Yip Man.
Bruce Lee (1940–1973) was a Chinese American martial artist and actor who was considered an important icon in the 20th century.[62] He practiced Wing Chun and made it famous. Using Wing Chun as his base and learning from the influences of other martial arts his experience exposed him to, he later developed his own martial arts philosophy which evolved into what is now known as Jeet Kune Do.
Jackie Chan (b. 1954) is a Chinese martial artist and actor widely known for injecting physical comedy into his martial arts performances, and for performing complex stunts in many of his films.
Jet Li (b. 1963) is the five-time sport wushu champion of China, later demonstrating his skills in cinema.
Donnie Yen (b. 1963) is a Hong Kong actor, martial artist, film director and producer, action choreographer, and world wushu tournament medalist.
Popular culture
References to the concepts and use of Chinese martial arts can be found in popular culture. Historically, the influence of Chinese martial arts can be found in books and in the performance arts specific to Asia[citation needed]. Recently, those influences have extended to movies and television that target a much wider audience. As a result, Chinese martial arts have spread beyond their ethnic roots and have a global appeal.[63][64]
Martial arts play a prominent role in the literature genre known as wuxia (武俠小說). This type of fiction is based on Chinese concepts of chivalry, a separate martial arts society (Wulin, 武林) and a central theme involving martial arts.[65] Wuxia stories can be traced as far back as the 2nd and 3rd centuries BCE, becoming popular by the Tang Dynasty and evolving into novel form by the Ming Dynasty. This genre is still extremely popular in much of Asia[citation needed] and provides a major influence for the public perception of the martial arts.
Martial arts influences can also be found in Chinese opera, of which Beijing opera is one of the best-known examples. This popular form of drama dates back to the Tang Dynasty and continues to be an example of Chinese culture. Some martial arts movements can be found in Chinese opera and some martial artists can be found as performers in Chinese operas.
In modern times, Chinese martial arts have spawned the genre of cinema known as the martial arts film. The films of Bruce Lee were instrumental in the initial burst of Chinese martial arts' popularity in the West in the 1970s.[66]
Martial artists and actors such as Jet Li and Jackie Chan have continued the appeal of movies of this genre. Martial arts films from China are often referred to as "kungfu movies" (功夫片), or "wire-fu" if extensive wire work is performed for special effects, and are still best known as part of the tradition of kungfu theater. (See also: wuxia, Hong Kong action cinema.)
In the West, kung fu has become a regular action staple, and makes appearances in many films that would not generally be considered "martial arts" films. These films include but are not limited to The Matrix Trilogy, Kill Bill, and The Transporter.
Martial arts themes can also be found on television networks. A U.S. network TV western series of the early 1970s called Kung Fu also served to popularize the Chinese martial arts on television. With 60 episodes over a three-year span, it was one of the first North American TV shows that tried to convey the philosophy and practice of Chinese martial arts.[67][68] The use of Chinese martial arts techniques can now be found in most TV action series, although the philosophy of Chinese martial arts is seldom portrayed in depth.
Friday, February 3, 2012
Thursday, February 2, 2012
Central processing unit
The central processing unit (CPU) is the portion of a computer system that carries out the instructions of a computer program, to perform the basic arithmetical, logical, and input/output operations of the system. The CPU plays a role somewhat analogous to the brain in the computer. The term has been in use in the computer industry at least since the early 1960s.[1] The form, design and implementation of CPUs have changed dramatically since the earliest examples, but their fundamental operation remains much the same.
On large machines, CPUs require one or more printed circuit boards. On personal computers and small workstations, the CPU is housed in a single silicon chip called a microprocessor. Since the 1970s the microprocessor class of CPUs has almost completely overtaken all other CPU implementations. Modern CPUs are large scale integrated circuits in packages typically less than four centimeters square, with hundreds of connecting pins.
Two typical components of a CPU are the arithmetic logic unit (ALU), which performs arithmetic and logical operations, and the control unit (CU), which extracts instructions from memory and decodes and executes them, calling on the ALU when necessary.
Not all computational systems rely on a central processing unit. An array processor or vector processor has multiple parallel computing elements, with no one unit considered the "center". In the distributed computing model, problems are solved by a distributed interconnected set of processors.
History
Main article: History of general purpose CPUs
EDVAC, one of the first stored program computers
Computers such as the ENIAC had to be physically rewired in order to perform different tasks, which caused these machines to be called "fixed-program computers." Since the term "CPU" is generally defined as a device for software (computer program) execution, the earliest devices that could rightly be called CPUs came with the advent of the stored-program computer.
The idea of a stored-program computer was already present in the design of J. Presper Eckert and John William Mauchly's ENIAC, but was initially omitted so that it could be finished sooner. On June 30, 1945, before ENIAC was completed, mathematician John von Neumann distributed the paper entitled First Draft of a Report on the EDVAC. It was the outline of a stored-program computer that would eventually be completed in August 1949.[2] EDVAC was designed to perform a certain number of instructions (or operations) of various types. These instructions could be combined to create useful programs for the EDVAC to run. Significantly, the programs written for EDVAC were stored in high-speed computer memory rather than specified by the physical wiring of the computer. This overcame a severe limitation of ENIAC, which was the considerable time and effort required to reconfigure the computer to perform a new task. With von Neumann's design, the program, or software, that EDVAC ran could be changed simply by changing the contents of the memory.
Early CPUs were custom-designed as a part of a larger, sometimes one-of-a-kind, computer. However, this method of designing custom CPUs for a particular application has largely given way to the development of mass-produced processors that are made for many purposes. This standardization began in the era of discrete transistor mainframes and minicomputers and has rapidly accelerated with the popularization of the integrated circuit (IC). The IC has allowed increasingly complex CPUs to be designed and manufactured to tolerances on the order of nanometers. Both the miniaturization and standardization of CPUs have increased the presence of digital devices in modern life far beyond the limited application of dedicated computing machines. Modern microprocessors appear in everything from automobiles to cell phones and children's toys.
While von Neumann is most often credited with the design of the stored-program computer because of his design of EDVAC, others before him, such as Konrad Zuse, had suggested and implemented similar ideas. The so-called Harvard architecture of the Harvard Mark I, which was completed before EDVAC, also utilized a stored-program design using punched paper tape rather than electronic memory. The key difference between the von Neumann and Harvard architectures is that the latter separates the storage and treatment of CPU instructions and data, while the former uses the same memory space for both. Most modern CPUs are primarily von Neumann in design, but elements of the Harvard architecture are commonly seen as well.
Relays and vacuum tubes (thermionic valves) were commonly used as switching elements; a useful computer requires thousands or tens of thousands of switching devices. The overall speed of a system is dependent on the speed of the switches. Tube computers like EDVAC tended to average eight hours between failures, whereas relay computers like the (slower, but earlier) Harvard Mark I failed very rarely.[1] In the end, tube-based CPUs became dominant because their significant speed advantages generally outweighed the reliability problems. Most of these early synchronous CPUs ran at low clock rates compared to modern microelectronic designs (see below for a discussion of clock rate). Clock signal frequencies ranging from 100 kHz to 4 MHz were very common at this time, limited largely by the speed of the switching devices they were built with.
Control unit
Main article: Control unit
The control unit of the CPU contains circuitry that uses electrical signals to direct the entire computer system to carry out stored program instructions. The control unit does not execute program instructions; rather, it directs other parts of the system to do so. The control unit must communicate with both the arithmetic/logic unit and memory.
Discrete transistor and integrated circuit CPUs
CPU, core memory, and external bus interface of a DEC PDP-8/I. Made of medium-scale integrated circuits
The design complexity of CPUs increased as various technologies facilitated building smaller and more reliable electronic devices. The first such improvement came with the advent of the transistor. Transistorized CPUs during the 1950s and 1960s no longer had to be built out of bulky, unreliable, and fragile switching elements like vacuum tubes and electrical relays. With this improvement more complex and reliable CPUs were built onto one or several printed circuit boards containing discrete (individual) components.
During this period, a method of manufacturing many transistors in a compact space gained popularity. The integrated circuit (IC) allowed a large number of transistors to be manufactured on a single semiconductor-based die, or "chip." At first only very basic non-specialized digital circuits such as NOR gates were miniaturized into ICs. CPUs based upon these "building block" ICs are generally referred to as "small-scale integration" (SSI) devices. SSI ICs, such as the ones used in the Apollo guidance computer, usually contained up to a few score transistors. To build an entire CPU out of SSI ICs required thousands of individual chips, but still consumed much less space and power than earlier discrete transistor designs. As microelectronic technology advanced, an increasing number of transistors were placed on ICs, thus decreasing the quantity of individual ICs needed for a complete CPU. MSI and LSI (medium- and large-scale integration) ICs increased transistor counts to hundreds, and then thousands.
In 1964 IBM introduced its System/360 computer architecture, which was used in a series of computers that could run the same programs with different speed and performance. This was significant at a time when most electronic computers were incompatible with one another, even those made by the same manufacturer. To facilitate this improvement, IBM utilized the concept of a microprogram (often called "microcode"), which still sees widespread usage in modern CPUs.[3] The System/360 architecture was so popular that it dominated the mainframe computer market for decades and left a legacy that is still continued by similar modern computers like the IBM zSeries. In the same year (1964), Digital Equipment Corporation (DEC) introduced another influential computer aimed at the scientific and research markets, the PDP-8. DEC would later introduce the extremely popular PDP-11 line that originally was built with SSI ICs but was eventually implemented with LSI components once these became practical. In stark contrast with its SSI and MSI predecessors, the first LSI implementation of the PDP-11 contained a CPU composed of only four LSI integrated circuits.[4]
Transistor-based computers had several distinct advantages over their predecessors. Aside from facilitating increased reliability and lower power consumption, transistors also allowed CPUs to operate at much higher speeds because of the short switching time of a transistor in comparison to a tube or relay. Thanks to both the increased reliability and the dramatically increased speed of the switching elements (which were almost exclusively transistors by this time), CPU clock rates in the tens of megahertz were obtained during this period. Additionally, while discrete transistor and IC CPUs were in heavy usage, new high-performance designs like SIMD (Single Instruction Multiple Data) vector processors began to appear. These early experimental designs later gave rise to the era of specialized supercomputers like those made by Cray Inc.
Microprocessors
Main article: Microprocessor
Die of an Intel 80486DX2 microprocessor (actual size: 12×6.75 mm) in its packaging
Intel Core i5 CPU on a Vaio E series laptop motherboard (on the right, beneath the heat pipe).
In the 1970s the fundamental inventions by Federico Faggin (silicon-gate MOS ICs with self-aligned gates, along with his new random logic design methodology) changed the design and implementation of CPUs forever. Since the introduction of the first commercially available microprocessor (the Intel 4004) in 1971 and the first widely used microprocessor (the Intel 8080) in 1974, this class of CPUs has almost completely overtaken all other central processing unit implementation methods. Mainframe and minicomputer manufacturers of the time launched proprietary IC development programs to upgrade their older computer architectures, and eventually produced instruction set compatible microprocessors that were backward-compatible with their older hardware and software. Combined with the advent and eventual vast success of the now ubiquitous personal computer, the term CPU is now applied almost exclusively to microprocessors. Several CPUs can be combined in a single processing chip.
Previous generations of CPUs were implemented as discrete components and numerous small integrated circuits (ICs) on one or more circuit boards. Microprocessors, on the other hand, are CPUs manufactured on a very small number of ICs, usually just one. The overall smaller CPU size as a result of being implemented on a single die means faster switching time because of physical factors like decreased gate parasitic capacitance. This has allowed synchronous microprocessors to have clock rates ranging from tens of megahertz to several gigahertz. Additionally, as the ability to construct exceedingly small transistors on an IC has increased, the complexity and number of transistors in a single CPU have increased dramatically. This widely observed trend is described by Moore's law, which has proven to be a fairly accurate predictor of the growth of CPU (and other IC) complexity to date.
While the complexity, size, construction, and general form of CPUs have changed drastically over the past sixty years, it is notable that the basic design and function have not changed much at all. Almost all common CPUs today can be very accurately described as von Neumann stored-program machines. As the aforementioned Moore's law continues to hold true, concerns have arisen about the limits of integrated circuit transistor technology. Extreme miniaturization of electronic gates is causing the effects of phenomena like electromigration and subthreshold leakage to become much more significant. These newer concerns are among the many factors causing researchers to investigate new methods of computing such as the quantum computer, as well as to expand the usage of parallelism and other methods that extend the usefulness of the classical von Neumann model.
Operation
The fundamental operation of most CPUs, regardless of the physical form they take, is to execute a sequence of stored instructions called a program. The program is represented by a series of numbers that are kept in some kind of computer memory. There are four steps that nearly all CPUs use in their operation: fetch, decode, execute, and writeback.
The first step, fetch, involves retrieving an instruction (which is represented by a number or sequence of numbers) from program memory. The location in program memory is determined by a program counter (PC), which stores a number that identifies the current position in the program. After an instruction is fetched, the PC is incremented by the length of the instruction word in terms of memory units.[5] Often, the instruction to be fetched must be retrieved from relatively slow memory, causing the CPU to stall while waiting for the instruction to be returned. This issue is largely addressed in modern processors by caches and pipeline architectures (see below).
The instruction that the CPU fetches from memory is used to determine what the CPU is to do. In the decode step, the instruction is broken up into parts that have significance to other portions of the CPU. The way in which the numerical instruction value is interpreted is defined by the CPU's instruction set architecture (ISA).[6] Often, one group of numbers in the instruction, called the opcode, indicates which operation to perform. The remaining parts of the number usually provide information required for that instruction, such as operands for an addition operation. Such operands may be given as a constant value (called an immediate value), or as a place to locate a value: a register or a memory address, as determined by some addressing mode. In older designs the portions of the CPU responsible for instruction decoding were unchangeable hardware devices. However, in more abstract and complicated CPUs and ISAs, a microprogram is often used to assist in translating instructions into various configuration signals for the CPU. This microprogram is sometimes rewritable so that it can be modified to change the way the CPU decodes instructions even after it has been manufactured.
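The decode step can be illustrated with a minimal sketch in Python. The 16-bit instruction layout, the opcode numbers and the names `OPCODE_NAMES` and `decode` are all hypothetical, chosen only for illustration; they do not correspond to any real ISA.

```python
# Hypothetical 16-bit instruction layout (an assumption, not a real ISA):
#   bits 15-12: opcode
#   bits 11-8:  destination register number
#   bits 7-0:   immediate value
OPCODE_NAMES = {0x1: "LOAD_IMM", 0x2: "ADD_IMM", 0xF: "HALT"}


def decode(word):
    """Split a 16-bit instruction word into (opcode name, register, immediate)."""
    opcode = (word >> 12) & 0xF      # top four bits select the operation
    register = (word >> 8) & 0xF     # next four bits name a register
    immediate = word & 0xFF          # low eight bits hold a constant operand
    return OPCODE_NAMES.get(opcode, "UNKNOWN"), register, immediate


# 0x1305 -> opcode 0x1, register 3, immediate 5
print(decode(0x1305))
```

In this sketch the opcode and operand fields are extracted with shifts and masks, which mirrors how a fixed-width instruction word is commonly split into fields.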
After the fetch and decode steps, the execute step is performed. During this step, various portions of the CPU are connected so they can perform the desired operation. If, for instance, an addition operation was requested, the arithmetic logic unit (ALU) will be connected to a set of inputs and a set of outputs. The inputs provide the numbers to be added, and the outputs will contain the final sum. The ALU contains the circuitry to perform simple arithmetic and logical operations on the inputs (like addition and bitwise operations). If the addition operation produces a result too large for the CPU to handle, an arithmetic overflow flag in a flags register may also be set.
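As a rough illustration of an addition that is too large for the CPU to handle, the following Python sketch models a fixed-width adder. The function name `alu_add` and the default width of 8 bits are assumptions made for the example. Values are treated as unsigned, so the flag returned here corresponds to what many real CPUs call a carry flag rather than a signed-overflow flag.

```python
def alu_add(a, b, bits=8):
    """Add two unsigned integers the way a `bits`-wide adder would.

    Returns (result, overflow): the sum truncated to `bits` bits, and a flag
    that is True when the true sum did not fit in `bits` bits.
    """
    mask = (1 << bits) - 1           # e.g. 0xFF for an 8-bit adder
    total = (a & mask) + (b & mask)  # full-precision sum of the truncated inputs
    return total & mask, total > mask


print(alu_add(100, 27))    # fits in 8 bits, flag stays clear
print(alu_add(200, 100))   # 300 does not fit in 8 bits, flag is set
```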
The final step, writeback, simply "writes back" the results of the execute step to some form of memory. Very often the results are written to some internal CPU register for quick access by subsequent instructions. In other cases results may be written to slower, but cheaper and larger, main memory. Some types of instructions manipulate the program counter rather than directly produce result data. These are generally called "jumps" and facilitate behavior like loops, conditional program execution (through the use of a conditional jump), and functions in programs.[7] Many instructions will also change the state of digits in a "flags" register. These flags can be used to influence how a program behaves, since they often indicate the outcome of various operations. For example, one type of "compare" instruction considers two values and sets a number in the flags register according to which one is greater. This flag could then be used by a later jump instruction to determine program flow.
After the execution of the instruction and writeback of the resulting data, the entire process repeats, with the next instruction cycle normally fetching the next-in-sequence instruction because of the incremented value in the program counter. If the completed instruction was a jump, the program counter will be modified to contain the address of the instruction that was jumped to, and program execution continues normally. In more complex CPUs than the one described here, multiple instructions can be fetched, decoded, and executed simultaneously. This section describes what is generally referred to as the "classic RISC pipeline", which in fact is quite common among the simple CPUs used in many electronic devices (often called microcontrollers). It largely ignores the important role of CPU cache, and therefore the access stage of the pipeline.
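The whole cycle described in this section (fetch, decode, execute, writeback, with a compare instruction setting a flag that a later conditional jump reads) can be sketched as a toy interpreter in Python. Everything here is an assumption for illustration: the instruction names (`LOADI`, `ADD`, `CMP`, `JLT`, `HALT`), the four-register machine, the 8-bit register width and the function `run` are invented, and instructions are stored as tuples rather than encoded numbers to keep the sketch short.

```python
def run(program, max_steps=1000):
    """Run a toy program and return the final register contents."""
    regs = [0, 0, 0, 0]              # four general-purpose registers
    flags = {"less": False}          # a one-bit "flags register"
    pc = 0                           # program counter
    for _ in range(max_steps):
        instr = program[pc]          # fetch the instruction at the PC
        pc += 1                      # PC now points at the next instruction
        op, *args = instr            # decode into opcode and operands
        if op == "LOADI":            # execute, then write back to a register
            regs[args[0]] = args[1]
        elif op == "ADD":
            regs[args[0]] = (regs[args[1]] + regs[args[2]]) & 0xFF
        elif op == "CMP":            # compare only updates the flags
            flags["less"] = regs[args[0]] < regs[args[1]]
        elif op == "JLT":            # conditional jump overwrites the PC
            if flags["less"]:
                pc = args[0]
        elif op == "HALT":
            return regs
        else:
            raise ValueError(f"unknown opcode: {op}")
    raise RuntimeError("step limit exceeded")


# Sum the integers 1..5 using a loop built from CMP and JLT.
PROGRAM = [
    ("LOADI", 0, 0),    # 0: r0 = 0 (running total)
    ("LOADI", 1, 0),    # 1: r1 = 0 (counter)
    ("LOADI", 2, 1),    # 2: r2 = 1 (constant one)
    ("LOADI", 3, 5),    # 3: r3 = 5 (loop limit)
    ("ADD", 1, 1, 2),   # 4: r1 = r1 + 1
    ("ADD", 0, 0, 1),   # 5: r0 = r0 + r1
    ("CMP", 1, 3),      # 6: less = (r1 < r3)
    ("JLT", 4),         # 7: if less, jump back to instruction 4
    ("HALT",),          # 8: stop
]

final_regs = run(PROGRAM)
print(final_regs[0])
```

The sketch processes one instruction at a time; it does not model pipelining, caches or the access stage mentioned above.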
Design and implementation
Main article: CPU design
The basic concept of a CPU is as follows:
Hardwired into a CPU's design is a list of basic operations it can perform, called an instruction set. Such operations may include adding or subtracting two numbers, comparing numbers, or jumping to a different part of a program. Each of these basic operations is represented by a particular sequence of bits; this sequence is called the opcode for that particular operation. Sending a particular opcode to a CPU will cause it to perform the operation represented by that opcode. To execute an instruction in a computer program, the CPU uses the opcode for that instruction as well as its arguments (for instance the two numbers to be added, in the case of an addition operation). A computer program is therefore a sequence of instructions, with each instruction including an opcode and that operation's arguments.
The actual mathematical operation for each instruction is performed by a subunit of the CPU known as the arithmetic logic unit or ALU. In addition to using its ALU to perform operations, a CPU is also responsible for reading the next instruction from memory, reading data specified in arguments from memory, and writing results to memory.
In many CPU designs, an instruction set will clearly differentiate between operations that load data from memory, and those that perform math. In this case the data loaded from memory is stored in registers, and a mathematical operation takes no arguments but simply performs the math on the data in the registers and writes it to a new register, whose value a separate operation may then write to memory.
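The separation between memory operations and math operations can be sketched in Python as follows. The instruction names (`LOAD`, `ADD`, `STORE`), the register names and the function `run_load_store` are hypothetical. One deliberate difference from the description above: in this sketch `ADD` names the registers it uses, but it still never touches memory directly.

```python
def run_load_store(program, memory):
    """Execute a toy load/store program against a list used as main memory."""
    regs = {"r0": 0, "r1": 0, "r2": 0}
    for op, *args in program:
        if op == "LOAD":             # LOAD reg, address: memory -> register
            reg, addr = args
            regs[reg] = memory[addr]
        elif op == "ADD":            # ADD dest, src1, src2: registers only
            dest, a, b = args
            regs[dest] = regs[a] + regs[b]
        elif op == "STORE":          # STORE reg, address: register -> memory
            reg, addr = args
            memory[addr] = regs[reg]
        else:
            raise ValueError(f"unknown opcode: {op}")
    return memory


memory = [7, 35, 0]                  # two inputs and one slot for the result
program = [
    ("LOAD", "r0", 0),               # r0 = memory[0]
    ("LOAD", "r1", 1),               # r1 = memory[1]
    ("ADD", "r2", "r0", "r1"),       # r2 = r0 + r1, no memory access
    ("STORE", "r2", 2),              # memory[2] = r2
]
result = run_load_store(program, memory)
print(result)
```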
Integer range
The way a CPU represents numbers is a design choice that affects the most basic ways in which the device functions. Some early digital computers used an electrical model of the common decimal (base ten) numeral system to represent numbers internally. A few other computers have used more exotic numeral systems like ternary (base three). Nearly all modern CPUs represent numbers in binary form, with each digit being represented by some two-valued physical quantity such as a "high" or "low" voltage.[8]
MOS 6502 microprocessor in a dual in-line package, an extremely popular 8-bit design
Related to number representation is the size and precision of numbers that a CPU can represent. In the case of a binary CPU, a bit refers to one significant place in the numbers a CPU deals with. The number of bits (or numeral places) a CPU uses to represent numbers is often called "word size", "bit width", "data path width", or "integer precision" when dealing with strictly integer numbers (as opposed to floating point). This number differs between architectures, and often within different parts of the very same CPU. For example, an 8-bit CPU deals with a range of numbers that can be represented by eight binary digits (each digit having two possible values), that is, 2^8 or 256 discrete numbers. In effect, integer size sets a hardware limit on the range of integers the software run by the CPU can utilize.[9]
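As a rough sketch of what this hardware limit means in practice, the snippet below computes the unsigned range for a given bit width and shows an addition wrapping around once the result no longer fits. The helper names `unsigned_range` and `wrap_add` are hypothetical, and wrap-around is only one possible overflow behavior, assumed here for simplicity.

```python
def unsigned_range(bits):
    """Smallest and largest unsigned integer representable in `bits` bits."""
    return 0, 2 ** bits - 1


def wrap_add(a, b, bits=8):
    """Add two unsigned integers, keeping only the low `bits` bits."""
    mask = (1 << bits) - 1
    return (a + b) & mask


low, high = unsigned_range(8)   # 256 discrete values in total
wrapped = wrap_add(250, 10)     # 260 does not fit in 8 bits
```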
Integer range can also affect the number of locations in memory the CPU can address (locate). For example, if a binary CPU uses 32 bits to represent a memory address, and each memory address represents one octet (8 bits), the maximum quantity of memory that CPU can address is 2^32 octets, or 4 GiB. This is a very simple view of CPU address space, and many designs use more complex addressing methods like paging in order to locate more memory than their integer range would allow with a flat address space.
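The arithmetic behind the 4 GiB figure can be checked with a short calculation. This assumes the flat, octet-addressed model described above; the function name `addressable_octets` is made up for the example.

```python
GIB = 2 ** 30  # one gibibyte, in octets


def addressable_octets(address_bits, octets_per_address=1):
    """Maximum memory reachable with a flat address of `address_bits` bits."""
    return (2 ** address_bits) * octets_per_address


flat_32 = addressable_octets(32)
flat_32_in_gib = flat_32 // GIB
```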
Higher levels of integer range require more structures to deal with the additional digits, and therefore more complexity, size, power usage, and general expense. It is not at all uncommon, therefore, to see 4- or 8-bit microcontrollers used in modern applications, even though CPUs with much higher range (such as 16, 32, 64, even 128-bit) are available. The simpler microcontrollers are usually cheaper, use less power, and therefore generate less heat, all of which can be major design considerations for electronic devices. However, in higher-end applications, the benefits afforded by the extra range (most often the additional address space) are more significant and often affect design choices. To gain some of the advantages afforded by both lower and higher bit lengths, many CPUs are designed with different bit widths for different portions of the device. For example, the IBM System/370 used a CPU that was primarily 32 bit, but it used 128-bit precision inside its floating point units to facilitate greater accuracy and range in floating point numbers.[3] Many later CPU designs use similar mixed bit width, especially when the processor is meant for general-purpose usage where a reasonable balance of integer and floating point capability is required.
Clock rate
The clock rate is the speed at which a microprocessor executes instructions. Every computer contains an internal clock that regulates the rate at which instructions are executed and synchronizes all the various computer components. The CPU requires a fixed number of clock ticks (or clock cycles) to execute each instruction. The faster the clock, the more instructions the CPU can execute per second.
Most CPUs, and indeed most sequential logic devices, are synchronous in nature.[10] That is, they are designed and operate on assumptions about a synchronization signal. This signal, known as a clock signal, usually takes the form of a periodic square wave. By calculating the maximum time that electrical signals take to move through the various branches of a CPU's many circuits, the designers can select an appropriate period for the clock signal.
This period must be longer than the amount of time it takes for a signal to move, or propagate, in the worst-case scenario. In setting the clock period to a value well above the worst-case propagation delay, it is possible to design the entire CPU and the way it moves data around the "edges" of the rising and falling clock signal. This has the advantage of simplifying the CPU significantly, both from a design perspective and a component-count perspective. However, it also carries the disadvantage that the entire CPU must wait on its slowest elements, even though some portions of it are much faster. This limitation has largely been compensated for by various methods of increasing CPU parallelism (see below).
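The way the slowest path bounds the clock can be sketched numerically. The path delays and the safety margin below are invented figures, and `clock_for` is a hypothetical helper; real timing analysis is considerably more involved.

```python
def clock_for(path_delays_ns, margin=1.25):
    """Pick a clock period above the worst-case propagation delay.

    Returns (period in nanoseconds, frequency in hertz). `margin` is an
    assumed safety factor, not a value taken from any real design.
    """
    worst_case_ns = max(path_delays_ns)
    period_ns = worst_case_ns * margin
    frequency_hz = 1e9 / period_ns
    return period_ns, frequency_hz


# Three hypothetical signal paths; the 2.0 ns path limits the whole chip.
period_ns, frequency_hz = clock_for([0.8, 1.5, 2.0])
```

Even though two of the three paths settle well within 2 ns, the whole design is clocked according to the slowest one, which is the disadvantage noted above.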
However, architectural improvements alone do not solve all of the drawbacks of globally synchronous CPUs. For example, a clock signal is subject to the delays of any other electrical signal. Higher clock rates in increasingly complex CPUs make it more difficult to keep the clock signal in phase (synchronized) throughout the entire unit. This has led many modern CPUs to require multiple identical clock signals to be provided in order to avoid delaying a single signal significantly enough to cause the CPU to malfunction. Another major issue as clock rates increase dramatically is the amount of heat that is dissipated by the CPU. The constantly changing clock causes many components to switch regardless of whether they are being used at that time. In general, a component that is switching uses more energy than an element in a static state. Therefore, as clock rate increases, so does heat dissipation, causing the CPU to require more effective cooling solutions.
One method of dealing with the switching of unneeded components is called clock gating, which involves turning off the clock signal to unneeded components (effectively disabling them). However, this is often regarded as difficult to implement and therefore does not see common usage outside of very low-power designs. One notable CPU design that uses clock gating is that of the IBM PowerPC-based Xbox 360, which uses extensive clock gating to reduce the power requirements of the console.[11] Another method of addressing some of the problems with a global clock signal is the removal of the clock signal altogether. While removing the global clock signal makes the design process considerably more complex in many ways, asynchronous (or clockless) designs carry marked advantages in power consumption and heat dissipation in comparison with similar synchronous designs. While somewhat uncommon, entire asynchronous CPUs have been built without utilizing a global clock signal. Two notable examples of this are the ARM compliant AMULET and the MIPS R3000 compatible MiniMIPS. Rather than totally removing the clock signal, some CPU designs allow certain portions of the device to be asynchronous, such as using asynchronous ALUs in conjunction with superscalar pipelining to achieve some arithmetic performance gains. While it is not altogether clear whether totally asynchronous designs can perform at a comparable or better level than their synchronous counterparts, it is evident that they do at least excel in simpler math operations. This, combined with their excellent power consumption and heat dissipation properties, makes them very suitable for embedded computers.[12]
Parallelism
Model of a subscalar CPU. Notice that it takes fifteen cycles to complete three instructions.
The description of the basic operation of a CPU offered in the previous section describes the simplest form that a CPU can take. This type of CPU, usually referred to as subscalar, operates on and executes one instruction on one or two pieces of data at a time.
This process gives rise to an inherent inefficiency in subscalar CPUs. Since only one instruction is executed at a time, the entire CPU must wait for that instruction to complete before proceeding to the next instruction. As a result, the subscalar CPU gets "hung up" on instructions which take more than one clock cycle to complete execution. Even adding a second execution unit (see below) does not improve performance much; rather than one pathway being hung up, now two pathways are hung up and the number of unused transistors is increased. This design, wherein the CPU's execution resources can operate on only one instruction at a time, can only possibly reach scalar performance (one instruction per clock). However, the performance is nearly always subscalar (less than one instruction per cycle).
Attempts to achieve scalar and better performance have resulted in a variety of design methodologies that cause the CPU to behave less linearly and more in parallel. When referring to parallelism in CPUs, two terms are generally used to classify these design techniques. Instruction level parallelism (ILP) seeks to increase the rate at which instructions are executed within a CPU (that is, to increase the utilization of on-die execution resources), and thread level parallelism (TLP) aims to increase the number of threads (effectively individual programs) that a CPU can execute simultaneously. The two methodologies differ both in the way they are implemented and in the relative effectiveness they afford in increasing the CPU's performance for an application.[13]
Instruction level parallelism
Basic five-stage pipeline. In the best case scenario, this pipeline can sustain a completion rate of one instruction per cycle.
One of the simplest methods used to accomplish increased parallelism is to begin the first steps of instruction fetching and decoding before the prior instruction finishes executing. This is the simplest form of a technique known as instruction pipelining, and is utilized in almost all modern general-purpose CPUs. Pipelining allows more than one instruction to be executed at any given time by breaking down the execution pathway into discrete stages. This separation can be compared to an assembly line, in which an instruction is made more complete at each stage until it exits the execution pipeline and is retired.
Pipelining does, however, introduce the possibility of a situation where the result of the previous operation is needed to complete the next operation, a condition often termed a data dependency conflict. To cope with this, additional care must be taken to check for these sorts of conditions and delay a portion of the instruction pipeline if this occurs. Naturally, accomplishing this requires additional circuitry, so pipelined processors are more complex than subscalar ones (though not very significantly so). A pipelined processor can become very nearly scalar, inhibited only by pipeline stalls (an instruction spending more than one clock cycle in a stage).
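The benefit of overlapping instructions can be estimated with a simple cycle count. The sketch below assumes an idealized five-stage pipeline in which every stage takes one cycle, and treats stalls as a single lump of extra cycles; the function names are hypothetical.

```python
def subscalar_cycles(instructions, stages=5):
    """Each instruction passes through every stage before the next starts."""
    return instructions * stages


def pipelined_cycles(instructions, stages=5, stall_cycles=0):
    """The first instruction fills the pipeline, then one completes per
    cycle; any stall cycles are simply added on top (a simplification)."""
    if instructions == 0:
        return 0
    return stages + (instructions - 1) + stall_cycles


no_overlap = subscalar_cycles(3)                 # fifteen cycles
ideal = pipelined_cycles(3)                      # seven cycles
with_stalls = pipelined_cycles(3, stall_cycles=2)
```

As the instruction count grows, the ideal pipelined figure approaches one instruction per cycle, while each stall pushes it back toward subscalar performance.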
Simple superscalar pipeline. By fetching and dispatching two instructions at a time, a maximum of two instructions per cycle can be completed.
Further improvement upon the idea of instruction pipelining led to the development of a method that decreases the idle time of CPU components even further. Designs that are said to be superscalar include a long instruction pipeline and multiple identical execution units.[14] In a superscalar pipeline, multiple instructions are read and passed to a dispatcher, which decides whether or not the instructions can be executed in parallel (simultaneously). If so they are dispatched to available execution units, resulting in the ability for several instructions to be executed simultaneously. In general, the more instructions a superscalar CPU is able to dispatch simultaneously to waiting execution units, the more instructions will be completed in a given cycle.
Most of the difficulty in the design of a superscalar CPU architecture lies in creating an effective dispatcher. The dispatcher needs to be able to quickly and correctly determine whether instructions can be executed in parallel, as well as dispatch them in such a way as to keep as many execution units busy as possible. This requires that the instruction pipeline be filled as often as possible and gives rise to the need in superscalar architectures for significant amounts of CPU cache. It also makes hazard-avoiding techniques like branch prediction, speculative execution, and out-of-order execution crucial to maintaining high levels of performance. By attempting to predict which branch (or path) a conditional instruction will take, the CPU can minimize the number of times that the entire pipeline must wait until a conditional instruction is completed. Speculative execution often provides modest performance increases by executing portions of code that may not be needed after a conditional operation completes. Out-of-order execution somewhat rearranges the order in which instructions are executed to reduce delays due to data dependencies. Also, in the case of single instruction, multiple data (SIMD) workloads, where a large amount of data of the same type has to be processed, modern processors can disable parts of the pipeline so that when a single instruction is executed many times, the CPU skips the fetch and decode phases and thus greatly increases performance on certain occasions, especially in highly repetitive programs such as video creation and photo processing software.
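The core decision a dispatcher makes, whether two instructions can go out together, can be sketched as a register-dependency check. The `Instr` record, the register names, and the deliberately conservative rule below are assumptions made for illustration; real dispatchers also consider execution-unit availability and use techniques such as register renaming.

```python
from collections import namedtuple

# Hypothetical instruction record: one destination register and a tuple of
# source registers.
Instr = namedtuple("Instr", ["dest", "srcs"])


def can_issue_together(first, second):
    """Conservative check for dispatching two instructions in one cycle."""
    read_after_write = first.dest in second.srcs    # second needs first's result
    write_after_write = first.dest == second.dest   # both write the same register
    write_after_read = second.dest in first.srcs    # second overwrites first's input
    return not (read_after_write or write_after_write or write_after_read)


add = Instr(dest="r1", srcs=("r2", "r3"))
independent = Instr(dest="r4", srcs=("r5", "r6"))
dependent = Instr(dest="r4", srcs=("r1", "r6"))     # reads r1 written by `add`

parallel_ok = can_issue_together(add, independent)
must_wait = not can_issue_together(add, dependent)
```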
In the case where a portion of the CPU is superscalar and part is not, the part which is not suffers a performance penalty due to scheduling stalls. The Intel P5 Pentium had two superscalar ALUs which could accept one instruction per clock each, but its FPU could not accept one instruction per clock. Thus the P5 was integer superscalar but not floating point superscalar. Intel's successor to the P5 architecture, P6, added superscalar capabilities to its floating point features, and therefore afforded a significant increase in floating point instruction performance.
Both simple pipelining and superscalar design increase a CPU's ILP by allowing a single processor to complete execution of instructions at rates surpassing one instruction per cycle (IPC).[15] Most modern CPU designs are at least somewhat superscalar, and nearly all general purpose CPUs designed in the last decade are superscalar. In later years some of the emphasis in designing high-ILP computers has been moved out of the CPU's hardware and into its software interface, or ISA. The strategy of the very long instruction word (VLIW) causes some ILP to become implied directly by the software, reducing the amount of work the CPU must perform to boost ILP and thereby reducing the design's complexity.
Thread-level parallelism
Another strategy of achieving performance is to execute multiple programs or threads in parallel. This area of research is known as parallel computing. In Flynn's taxonomy, this strategy is known as Multiple Instructions-Multiple Data or MIMD.
One technology used for this purpose was multiprocessing (MP). The initial flavor of this technology is known as symmetric multiprocessing (SMP), where a small number of CPUs share a coherent view of their memory system. In this scheme, each CPU has additional hardware to maintain a constantly up-to-date view of memory. By avoiding stale views of memory, the CPUs can cooperate on the same program and programs can migrate from one CPU to another. To increase the number of cooperating CPUs beyond a handful, schemes such as non-uniform memory access (NUMA) and directory-based coherence protocols were introduced in the 1990s. SMP systems are limited to a small number of CPUs while NUMA systems have been built with thousands of processors. Initially, multiprocessing was built using multiple discrete CPUs and boards to implement the interconnect between the processors. When the processors and their interconnect are all implemented on a single silicon chip, the technology is known as a multi-core microprocessor.
It was later recognized that finer-grain parallelism existed within a single program. A single program might have several threads (or functions) that could be executed separately or in parallel. Some of the earliest examples of this technology implemented input/output processing such as direct memory access as a separate thread from the computation thread. A more general approach to this technology was introduced in the 1970s when systems were designed to run multiple computation threads in parallel. This technology is known as multi-threading (MT). This approach is considered more cost-effective than multiprocessing, as only a small number of components within a CPU is replicated in order to support MT as opposed to the entire CPU in the case of MP. In MT, the execution units and the memory system including the caches are shared among multiple threads. The downside of MT is that the hardware support for multithreading is more visible to software than that of MP and thus supervisor software like operating systems have to undergo larger changes to support MT. One type of MT that was implemented is known as block multithreading, where one thread is executed until it is stalled waiting for data to return from external memory. In this scheme, the CPU would then quickly switch to another thread which is ready to run, with the switch often done in one CPU clock cycle, as in UltraSPARC technology. Another type of MT is known as simultaneous multithreading, where instructions of multiple threads are executed in parallel within one CPU clock cycle.
For several decades from the 1970s to early 2000s, the focus in designing high performance general purpose CPUs was largely on achieving high ILP through technologies such as pipelining, caches, superscalar execution, out-of-order execution, etc. This trend culminated in large, power-hungry CPUs such as the Intel Pentium 4. By the early 2000s, CPU designers were thwarted from achieving higher performance from ILP techniques due to the growing disparity between CPU operating frequencies and main memory operating frequencies as well as escalating CPU power dissipation owing to more esoteric ILP techniques.
CPU designers then borrowed ideas from commercial computing markets such as transaction processing, where the aggregate performance of multiple programs, also known as throughput computing, was more important than the performance of a single thread or program.
This reversal of emphasis is evidenced by the proliferation of dual and multiple core CMP (chip-level multiprocessing) designs and notably, Intel's newer designs resembling its less superscalar P6 architecture. Late designs in several processor families exhibit CMP, including the x86-64 Opteron and Athlon 64 X2, the SPARC UltraSPARC T1, IBM POWER4 and POWER5, as well as several video game console CPUs like the Xbox 360's triple-core PowerPC design, and the PS3's 7-core Cell microprocessor.
Data parallelism
A less common but increasingly important paradigm of CPUs (and indeed, computing in general) deals with data parallelism. The processors discussed earlier are all referred to as some type of scalar device.[16] As the name implies, vector processors deal with multiple pieces of data in the context of one instruction. This contrasts with scalar processors, which deal with one piece of data for every instruction. Using Flynn's taxonomy, these two schemes of dealing with data are generally referred to as SIMD (single instruction, multiple data) and SISD (single instruction, single data), respectively. The great utility in creating CPUs that deal with vectors of data lies in optimizing tasks that tend to require the same operation (for example, a sum or a dot product) to be performed on a large set of data. Some classic examples of these types of tasks are multimedia applications (images, video, and sound), as well as many types of scientific and engineering tasks. Whereas a scalar CPU must complete the entire process of fetching, decoding, and executing each instruction and value in a set of data, a vector CPU can perform a single operation on a comparatively large set of data with one instruction. Of course, this is only possible when the application tends to require many steps which apply one operation to a large set of data.
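The difference in instruction count between the two schemes can be modeled in a few lines. This is only a software model: `scalar_add` and `vector_add` are hypothetical helpers, the lane count of four is an assumption, and the "instructions issued" counter stands in for what real hardware would fetch and decode.

```python
def scalar_add(a, b):
    """SISD model: one add instruction per pair of elements."""
    out = []
    issued = 0
    for x, y in zip(a, b):
        out.append(x + y)
        issued += 1
    return out, issued


def vector_add(a, b, lanes=4):
    """SIMD model: one add instruction covers up to `lanes` element pairs."""
    out = []
    issued = 0
    for i in range(0, len(a), lanes):
        out.extend(x + y for x, y in zip(a[i:i + lanes], b[i:i + lanes]))
        issued += 1
    return out, issued


a = [1, 2, 3, 4, 5, 6, 7, 8]
b = [10] * 8
scalar_result, scalar_issued = scalar_add(a, b)
vector_result, vector_issued = vector_add(a, b)
```

Both versions produce the same sums, but the vector model issues a quarter as many instructions for this input.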
Most early vector CPUs, such as the Cray-1, were associated almost exclusively with scientific research and cryptography applications. However, as multimedia has largely shifted to digital media, the need for some form of SIMD in general-purpose CPUs has become significant. Shortly after inclusion of floating point execution units started to become commonplace in general-purpose processors, specifications for and implementations of SIMD execution units also began to appear for general-purpose CPUs. Some of these early SIMD specifications like HP's Multimedia Acceleration eXtensions (MAX) and Intel's MMX were integer-only. This proved to be a significant impediment for some software developers, since many of the applications that benefit from SIMD primarily deal with floating point numbers. Progressively, these early designs were refined and remade into some of the common, modern SIMD specifications, which are usually associated with one ISA. Some notable modern examples are Intel's SSE and the PowerPC-related AltiVec (also known as VMX).[17]
Performance
The performance or speed of a processor depends on the clock rate (generally given in multiples of hertz) and the instructions per clock (IPC), which together are the factors for the instructions per second (IPS) that the CPU can perform.[18] Many reported IPS values have represented "peak" execution rates on artificial instruction sequences with few branches, whereas realistic workloads consist of a mix of instructions and applications, some of which take longer to execute than others. The performance of the memory hierarchy also greatly affects processor performance, an issue barely considered in MIPS calculations. Because of these problems, various standardized tests, often called "benchmarks" (such as SPECint), have been developed to attempt to measure the real effective performance in commonly used applications.
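The gap between a peak figure and a realistic one can be illustrated by multiplying clock rate by IPC for two cases. The 2 GHz clock and the instruction mix below are invented numbers, and `average_ipc` is a hypothetical helper that ignores memory-hierarchy effects entirely.

```python
def instructions_per_second(clock_hz, ipc):
    """IPS as the product of clock rate and instructions per clock."""
    return clock_hz * ipc


def average_ipc(mix):
    """`mix` maps a label to (fraction of instructions, cycles each takes)."""
    average_cycles = sum(fraction * cycles for fraction, cycles in mix.values())
    return 1.0 / average_cycles


clock_hz = 2e9  # assumed 2 GHz clock

# "Peak" case: every instruction completes in a single cycle.
peak_ips = instructions_per_second(clock_hz, 1.0)

# Assumed workload: half simple ALU operations, the rest slower.
mix = {"alu": (0.5, 1), "load": (0.3, 2), "branch": (0.2, 3)}
realistic_ips = instructions_per_second(clock_hz, average_ipc(mix))
```

With this mix the average instruction takes about 1.7 cycles, so the realistic rate falls well below the peak, which is the kind of discrepancy benchmarks are meant to expose.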
Processing performance of computers is increased by using multi-core processors, which essentially means placing two or more individual processors (called cores in this sense) into one integrated circuit.[19] Ideally, a dual core processor would be nearly twice as powerful as a single core processor. In practice, however, the performance gain is far less, only about 50%,[19] due to imperfect software algorithms and implementation.
On large machines, CPUs require one or more printed circuit boards. On personal computers and small workstations, the CPU is housed in a single silicon chip called a microprocessor. Since the 1970s the microprocessor class of CPUs has almost completely overtaken all other CPU implementations. Modern CPUs are large scale integrated circuits in packages typically less than four centimeters square, with hundreds of connecting pins.
Two typical components of a CPU are the arithmetic logic unit (ALU), which performs arithmetic and logical operations, and the control unit (CU), which extracts instructions from memory and decodes and executes them, calling on the ALU when necessary.
Not all computational systems rely on a central processing unit. An array processor or vector processor has multiple parallel computing elements, with no one unit considered the "center". In the distributed computing model, problems are solved by a distributed interconnected set of processors.
History
EDVAC, one of the first stored program computers
Computers such as the ENIAC had to be physically rewired in order to perform different tasks, which caused these machines to be called "fixed-program computers." Since the term "CPU" is generally defined as a device for software (computer program) execution, the earliest devices that could rightly be called CPUs came with the advent of the stored-program computer.
The idea of a stored-program computer was already present in the design of J. Presper Eckert and John William Mauchly's ENIAC, but was initially omitted so that it could be finished sooner. On June 30, 1945, before ENIAC was completed, mathematician John von Neumann distributed the paper entitled First Draft of a Report on the EDVAC. It was the outline of a stored-program computer that would eventually be completed in August 1949.[2] EDVAC was designed to perform a certain number of instructions (or operations) of various types. These instructions could be combined to create useful programs for the EDVAC to run. Significantly, the programs written for EDVAC were stored in high-speed computer memory rather than specified by the physical wiring of the computer. This overcame a severe limitation of ENIAC, which was the considerable time and effort required to reconfigure the computer to perform a new task. With von Neumann's design, the program, or software, that EDVAC ran could be changed simply by changing the contents of the memory.
Early CPUs were custom-designed as a part of a larger, sometimes one-of-a-kind, computer. However, this method of designing custom CPUs for a particular application has largely given way to the development of mass-produced processors that are made for many purposes. This standardization began in the era of discrete transistor mainframes and minicomputers and has rapidly accelerated with the popularization of the integrated circuit (IC). The IC has allowed increasingly complex CPUs to be designed and manufactured to tolerances on the order of nanometers. Both the miniaturization and standardization of CPUs have increased the presence of digital devices in modern life far beyond the limited application of dedicated computing machines. Modern microprocessors appear in everything from automobiles to cell phones and children's toys.
While von Neumann is most often credited with the design of the stored-program computer because of his design of EDVAC, others before him, such as Konrad Zuse, had suggested and implemented similar ideas. The so-called Harvard architecture of the Harvard Mark I, which was completed before EDVAC, also utilized a stored-program design using punched paper tape rather than electronic memory. The key difference between the von Neumann and Harvard architectures is that the latter separates the storage and treatment of CPU instructions and data, while the former uses the same memory space for both. Most modern CPUs are primarily von Neumann in design, but elements of the Harvard architecture are commonly seen as well.
Relays and vacuum tubes (thermionic valves) were commonly used as switching elements; a useful computer requires thousands or tens of thousands of switching devices. The overall speed of a system is dependent on the speed of the switches. Tube computers like EDVAC tended to average eight hours between failures, whereas relay computers like the (slower, but earlier) Harvard Mark I failed very rarely.[1] In the end, tube based CPUs became dominant because the significant speed advantages afforded generally outweighed the reliability problems. Most of these early synchronous CPUs ran at low clock rates compared to modern microelectronic designs (see below for a discussion of clock rate). Clock signal frequencies ranging from 100 kHz to 4 MHz were very common at this time, limited largely by the speed of the switching devices they were built with.
Control unit
The control unit of the CPU contains circuitry that uses electrical signals to direct the entire computer system to carry out stored program instructions. The control unit does not execute program instructions; rather, it directs other parts of the system to do so. The control unit must communicate with both the arithmetic/logic unit and memory.
Discrete transistor and integrated circuit CPUs
CPU, core memory, and external bus interface of a DEC PDP-8/I. Made of medium-scale integrated circuits
The design complexity of CPUs increased as various technologies facilitated building smaller and more reliable electronic devices. The first such improvement came with the advent of the transistor. Transistorized CPUs during the 1950s and 1960s no longer had to be built out of bulky, unreliable, and fragile switching elements like vacuum tubes and electrical relays. With this improvement more complex and reliable CPUs were built onto one or several printed circuit boards containing discrete (individual) components.
During this period, a method of manufacturing many transistors in a compact space gained popularity. The integrated circuit (IC) allowed a large number of transistors to be manufactured on a single semiconductor-based die, or "chip." At first only very basic non-specialized digital circuits such as NOR gates were miniaturized into ICs. CPUs based upon these "building block" ICs are generally referred to as "small-scale integration" (SSI) devices. SSI ICs, such as the ones used in the Apollo guidance computer, usually contained up to a few score transistors. To build an entire CPU out of SSI ICs required thousands of individual chips, but still consumed much less space and power than earlier discrete transistor designs. As microelectronic technology advanced, an increasing number of transistors were placed on ICs, thus decreasing the quantity of individual ICs needed for a complete CPU. MSI and LSI (medium- and large-scale integration) ICs increased transistor counts to hundreds, and then thousands.
In 1964 IBM introduced its System/360 computer architecture which was used in a series of computers that could run the same programs with different speed and performance. This was significant at a time when most electronic computers were incompatible with one another, even those made by the same manufacturer. To facilitate this improvement, IBM utilized the concept of a microprogram (often called "microcode"), which still sees widespread usage in modern CPUs.[3] The System/360 architecture was so popular that it dominated the mainframe computer market for decades and left a legacy that is still continued by similar modern computers like the IBM zSeries. In the same year (1964), Digital Equipment Corporation (DEC) introduced another influential computer aimed at the scientific and research markets, the PDP-8. DEC would later introduce the extremely popular PDP-11 line that originally was built with SSI ICs but was eventually implemented with LSI components once these became practical. In stark contrast with its SSI and MSI predecessors, the first LSI implementation of the PDP-11 contained a CPU composed of only four LSI integrated circuits.[4]
Transistor-based computers had several distinct advantages over their predecessors. Aside from facilitating increased reliability and lower power consumption, transistors also allowed CPUs to operate at much higher speeds because of the short switching time of a transistor in comparison to a tube or relay. Thanks to both the increased reliability as well as the dramatically increased speed of the switching elements (which were almost exclusively transistors by this time), CPU clock rates in the tens of megahertz were obtained during this period. Additionally, while discrete transistor and IC CPUs were in heavy usage, new high-performance designs like SIMD (Single Instruction Multiple Data) vector processors began to appear. These early experimental designs later gave rise to the era of specialized supercomputers like those made by Cray Inc.
Microprocessors
Main article: Microprocessor
Die of an Intel 80486DX2 microprocessor (actual size: 12×6.75 mm) in its packaging
Intel Core i5 CPU on a Vaio E series laptop motherboard (on the right, beneath the heat pipe).
In the 1970s the fundamental inventions by Federico Faggin (silicon-gate MOS ICs with self-aligned gates, along with his new random logic design methodology) changed the design and implementation of CPUs forever. Since the introduction of the first commercially available microprocessor (the Intel 4004) in 1971 and the first widely used microprocessor (the Intel 8080) in 1974, this class of CPUs has almost completely overtaken all other central processing unit implementation methods. Mainframe and minicomputer manufacturers of the time launched proprietary IC development programs to upgrade their older computer architectures, and eventually produced instruction-set-compatible microprocessors that were backward-compatible with their older hardware and software. Combined with the advent and eventual vast success of the now ubiquitous personal computer, the term CPU is now applied almost exclusively to microprocessors. Several CPUs can be combined in a single processing chip.
Previous generations of CPUs were implemented as discrete components and numerous small integrated circuits (ICs) on one or more circuit boards. Microprocessors, on the other hand, are CPUs manufactured on a very small number of ICs, usually just one. The overall smaller CPU size as a result of being implemented on a single die means faster switching time because of physical factors like decreased gate parasitic capacitance. This has allowed synchronous microprocessors to have clock rates ranging from tens of megahertz to several gigahertz. Additionally, as the ability to construct exceedingly small transistors on an IC has increased, the complexity and number of transistors in a single CPU have increased dramatically. This widely observed trend is described by Moore's law, which has proven to be a fairly accurate predictor of the growth of CPU (and other IC) complexity to date.
While the complexity, size, construction, and general form of CPUs have changed drastically over the past sixty years, it is notable that the basic design and function have not changed much at all. Almost all common CPUs today can be very accurately described as von Neumann stored-program machines. As the aforementioned Moore's law continues to hold true, concerns have arisen about the limits of integrated circuit transistor technology. Extreme miniaturization of electronic gates is causing the effects of phenomena like electromigration and subthreshold leakage to become much more significant. These newer concerns are among the many factors causing researchers to investigate new methods of computing such as the quantum computer, as well as to expand the usage of parallelism and other methods that extend the usefulness of the classical von Neumann model.
Operation
The fundamental operation of most CPUs, regardless of the physical form they take, is to execute a sequence of stored instructions called a program. The program is represented by a series of numbers that are kept in some kind of computer memory. There are four steps that nearly all CPUs use in their operation: fetch, decode, execute, and writeback.
The first step, fetch, involves retrieving an instruction (which is represented by a number or sequence of numbers) from program memory. The location in program memory is determined by a program counter (PC), which stores a number that identifies the current position in the program. After an instruction is fetched, the PC is incremented by the length of the instruction word in terms of memory units.[5] Often, the instruction to be fetched must be retrieved from relatively slow memory, causing the CPU to stall while waiting for the instruction to be returned. This issue is largely addressed in modern processors by caches and pipeline architectures (see below).
The instruction that the CPU fetches from memory is used to determine what the CPU is to do. In the decode step, the instruction is broken up into parts that have significance to other portions of the CPU. The way in which the numerical instruction value is interpreted is defined by the CPU's instruction set architecture (ISA).[6] Often, one group of numbers in the instruction, called the opcode, indicates which operation to perform. The remaining parts of the number usually provide information required for that instruction, such as operands for an addition operation. Such operands may be given as a constant value (called an immediate value), or as a place to locate a value: a register or a memory address, as determined by some addressing mode. In older designs the portions of the CPU responsible for instruction decoding were unchangeable hardware devices. However, in more abstract and complicated CPUs and ISAs, a microprogram is often used to assist in translating instructions into various configuration signals for the CPU. This microprogram is sometimes rewritable so that it can be modified to change the way the CPU decodes instructions even after it has been manufactured.
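As a rough illustration of the decode step, the sketch below splits a numeric instruction word into an opcode and operand fields. The 16-bit layout (four 4-bit fields) and the function name decode are assumptions invented for this example; real ISAs define their own encodings.

```python
def decode(word):
    """Split a 16-bit instruction word into (opcode, dest, src, immediate).

    Hypothetical layout, most significant bits first:
    4-bit opcode | 4-bit destination register | 4-bit source register | 4-bit immediate
    """
    opcode = (word >> 12) & 0xF  # top four bits select the operation
    dest = (word >> 8) & 0xF     # next four bits name the destination register
    src = (word >> 4) & 0xF      # next four bits name the source register
    imm = word & 0xF             # lowest four bits hold a small constant
    return opcode, dest, src, imm


fields = decode(0x1234)
print(fields)
```

Each field is recovered by shifting the word right and masking off four bits, which is roughly what fixed decoding hardware does with wires rather than arithmetic.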
After the fetch and decode steps, the execute step is performed. During this step, various portions of the CPU are connected so they can perform the desired operation. If, for instance, an addition operation was requested, the arithmetic logic unit (ALU) will be connected to a set of inputs and a set of outputs. The inputs provide the numbers to be added, and the outputs will contain the final sum. The ALU contains the circuitry to perform simple arithmetic and logical operations on the inputs (like addition and bitwise operations). If the addition operation produces a result too large for the CPU to handle, an arithmetic overflow flag in a flags register may also be set.
The final step, writeback, simply "writes back" the results of the execute step to some form of memory. Very often the results are written to some internal CPU register for quick access by subsequent instructions. In other cases results may be written to slower, but cheaper and larger, main memory. Some types of instructions manipulate the program counter rather than directly produce result data. These are generally called "jumps" and facilitate behavior like loops, conditional program execution (through the use of a conditional jump), and functions in programs.[7] Many instructions will also change the state of digits in a "flags" register. These flags can be used to influence how a program behaves, since they often indicate the outcome of various operations. For example, one type of "compare" instruction considers two values and sets a number in the flags register according to which one is greater. This flag could then be used by a later jump instruction to determine program flow.
After the execution of the instruction and writeback of the resulting data, the entire process repeats, with the next instruction cycle normally fetching the next-in-sequence instruction because of the incremented value in the program counter. If the completed instruction was a jump, the program counter will be modified to contain the address of the instruction that was jumped to, and program execution continues normally. In more complex CPUs than the one described here, multiple instructions can be fetched, decoded, and executed simultaneously. This section describes what is generally referred to as the "classic RISC pipeline", which in fact is quite common among the simple CPUs used in many electronic devices (often called microcontrollers). It largely ignores the important role of CPU cache, and therefore the access stage of the pipeline.
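The four steps described above can be sketched as a small interpreter loop. The instruction format (tuples of opcode, destination, and two operands), the opcode names, and the function run are all hypothetical and chosen only for illustration; they do not correspond to any real instruction set.

```python
def run(program, num_registers=4):
    """Run a toy program and return the final register values."""
    registers = [0] * num_registers
    pc = 0  # program counter: index of the next instruction to fetch
    while pc < len(program):
        # Fetch: read the instruction at the program counter, then advance it.
        instruction = program[pc]
        pc += 1
        # Decode: split the instruction into an opcode and its operands.
        opcode, dest, a, b = instruction
        # Execute: perform the operation selected by the opcode.
        if opcode == "LOADI":    # immediate value a
            result = a
        elif opcode == "ADD":    # sum of registers a and b
            result = registers[a] + registers[b]
        elif opcode == "SUB":    # register a minus register b
            result = registers[a] - registers[b]
        elif opcode == "JNZ":    # jump to address a if register dest is non-zero
            if registers[dest] != 0:
                pc = a
            continue             # a jump changes the PC and writes no result
        elif opcode == "HALT":
            break
        else:
            raise ValueError(f"unknown opcode: {opcode}")
        # Writeback: store the result in the destination register.
        registers[dest] = result
    return registers


# Adds 3 + 2 + 1 using a loop built from a conditional jump.
program = [
    ("LOADI", 0, 3, None),       # r0 = 3 (loop counter)
    ("LOADI", 1, 0, None),       # r1 = 0 (running total)
    ("LOADI", 2, 1, None),       # r2 = 1 (constant)
    ("ADD", 1, 1, 0),            # r1 = r1 + r0
    ("SUB", 0, 0, 2),            # r0 = r0 - r2
    ("JNZ", 0, 3, None),         # if r0 != 0, jump back to instruction 3
    ("HALT", None, None, None),
]
print(run(program))  # [0, 6, 1, 0]
```

The jump instruction shows why the program counter is incremented during fetch: a jump simply overwrites that value, and the next fetch continues from the new address.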
Design and implementation
Main article: CPU design
The basic concept of a CPU is as follows:
Hardwired into a CPU's design is a list of basic operations it can perform, called an instruction set. Such operations may include adding or subtracting two numbers, comparing numbers, or jumping to a different part of a program. Each of these basic operations is represented by a particular sequence of bits; this sequence is called the opcode for that particular operation. Sending a particular opcode to a CPU will cause it to perform the operation represented by that opcode. To execute an instruction in a computer program, the CPU uses the opcode for that instruction as well as its arguments (for instance the two numbers to be added, in the case of an addition operation). A computer program is therefore a sequence of instructions, with each instruction including an opcode and that operation's arguments.
The actual mathematical operation for each instruction is performed by a subunit of the CPU known as the arithmetic logic unit or ALU. In addition to using its ALU to perform operations, a CPU is also responsible for reading the next instruction from memory, reading data specified in arguments from memory, and writing results to memory.
In many CPU designs, an instruction set will clearly differentiate between operations that load data from memory and those that perform math. In this case the data loaded from memory is stored in registers, and a mathematical operation takes no memory arguments but simply performs the math on the data in the registers and writes the result to a new register, whose value a separate operation may then write to memory.
Integer range
The way a CPU represents numbers is a design choice that affects the most basic ways in which the device functions. Some early digital computers used an electrical model of the common decimal (base ten) numeral system to represent numbers internally. A few other computers have used more exotic numeral systems like ternary (base three). Nearly all modern CPUs represent numbers in binary form, with each digit being represented by some two-valued physical quantity such as a "high" or "low" voltage.[8]
MOS 6502 microprocessor in a dual in-line package, an extremely popular 8-bit design
Related to number representation is the size and precision of numbers that a CPU can represent. In the case of a binary CPU, a bit refers to one significant place in the numbers a CPU deals with. The number of bits (or numeral places) a CPU uses to represent numbers is often called "word size", "bit width", "data path width", or "integer precision" when dealing with strictly integer numbers (as opposed to floating point). This number differs between architectures, and often within different parts of the very same CPU. For example, an 8-bit CPU deals with a range of numbers that can be represented by eight binary digits (each digit having two possible values), that is, 2^8 or 256 discrete numbers. In effect, integer size sets a hardware limit on the range of integers the software run by the CPU can utilize.[9]
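The effect of a fixed bit width can be sketched in a few lines. The function names below are hypothetical, and the example assumes unsigned integers only, where a result that does not fit wraps around and raises a flag (strictly a carry out of the top bit).

```python
def unsigned_range(bits):
    """Return the smallest and largest unsigned value that fits in `bits` bits."""
    return 0, 2 ** bits - 1


def add_with_overflow(a, b, bits=8):
    """Add two unsigned values as a `bits`-wide adder would.

    Returns the wrapped result and a flag that is True when the true sum
    did not fit in the available bits.
    """
    mask = (1 << bits) - 1   # e.g. 0xFF for 8 bits
    total = a + b
    return total & mask, total > mask


print(unsigned_range(8))            # (0, 255): 256 discrete values
print(add_with_overflow(200, 100))  # (44, True): 300 does not fit in 8 bits
```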
Integer range can also affect the number of locations in memory the CPU can address (locate). For example, if a binary CPU uses 32 bits to represent a memory address, and each memory address represents one octet (8 bits), the maximum quantity of memory that CPU can address is 2^32 octets, or 4 GiB. This is a very simple view of CPU address space, and many designs use more complex addressing methods like paging in order to locate more memory than their integer range would allow with a flat address space.
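The address-space arithmetic above can be checked directly. This is only a sketch of the flat-address-space case; the helper name addressable_octets is an assumption for this example.

```python
GIB = 2 ** 30  # one gibibyte, in octets


def addressable_octets(address_bits, octets_per_address=1):
    """Octets reachable with a flat address of `address_bits` bits."""
    return (2 ** address_bits) * octets_per_address


print(addressable_octets(32) // GIB)  # 4, i.e. 4 GiB
print(addressable_octets(16))         # 65536, i.e. 64 KiB
```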
Higher levels of integer range require more structures to deal with the additional digits, and therefore more complexity, size, power usage, and general expense. It is not at all uncommon, therefore, to see 4- or 8-bit microcontrollers used in modern applications, even though CPUs with much higher range (such as 16, 32, 64, even 128-bit) are available. The simpler microcontrollers are usually cheaper, use less power, and therefore generate less heat, all of which can be major design considerations for electronic devices. However, in higher-end applications, the benefits afforded by the extra range (most often the additional address space) are more significant and often affect design choices. To gain some of the advantages afforded by both lower and higher bit lengths, many CPUs are designed with different bit widths for different portions of the device. For example, the IBM System/370 used a CPU that was primarily 32-bit, but it used 128-bit precision inside its floating point units to facilitate greater accuracy and range in floating point numbers.[3] Many later CPU designs use similar mixed bit widths, especially when the processor is meant for general-purpose usage where a reasonable balance of integer and floating point capability is required.
Clock rate
Main article: Clock rate
The clock rate is the speed at which a microprocessor executes instructions. Every computer contains an internal clock that regulates the rate at which instructions are executed and synchronizes all the various computer components. The CPU requires a fixed number of clock ticks (or clock cycles) to execute each instruction. The faster the clock, the more instructions the CPU can execute per second.
Most CPUs, and indeed most sequential logic devices, are synchronous in nature.[10] That is, they are designed and operate on assumptions about a synchronization signal. This signal, known as a clock signal, usually takes the form of a periodic square wave. By calculating the maximum time that electrical signals can move in various branches of a CPU's many circuits, the designers can select an appropriate period for the clock signal.
This period must be longer than the amount of time it takes for a signal to move, or propagate, in the worst-case scenario. In setting the clock period to a value well above the worst-case propagation delay, it is possible to design the entire CPU and the way it moves data around the "edges" of the rising and falling clock signal. This has the advantage of simplifying the CPU significantly, both from a design perspective and a component-count perspective. However, it also carries the disadvantage that the entire CPU must wait on its slowest elements, even though some portions of it are much faster. This limitation has largely been compensated for by various methods of increasing CPU parallelism (see below).
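The relationship between worst-case propagation delay and clock rate can be sketched as follows. The path delays and the safety margin are made-up illustrative numbers, not measurements of any real design, and max_clock_hz is a hypothetical helper.

```python
def max_clock_hz(path_delays_ns, margin=1.25):
    """Estimate the highest usable clock rate for a set of signal paths.

    path_delays_ns: propagation delay of each path, in nanoseconds.
    margin: factor by which the clock period exceeds the worst-case delay.
    """
    worst_case_ns = max(path_delays_ns)  # the slowest path sets the limit
    period_ns = worst_case_ns * margin   # keep the period above that delay
    return 1e9 / period_ns               # convert period (ns) to frequency (Hz)


# Three hypothetical paths; the 2.0 ns path limits the whole design.
print(max_clock_hz([0.8, 1.2, 2.0]) / 1e6, "MHz")
```

Note that the two faster paths contribute nothing to the result, which is the disadvantage described above: the whole CPU waits on its slowest element.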
However, architectural improvements alone do not solve all of the drawbacks of globally synchronous CPUs. For example, a clock signal is subject to the delays of any other electrical signal. Higher clock rates in increasingly complex CPUs make it more difficult to keep the clock signal in phase (synchronized) throughout the entire unit. This has led many modern CPUs to require multiple identical clock signals to be provided in order to avoid delaying a single signal significantly enough to cause the CPU to malfunction. Another major issue as clock rates increase dramatically is the amount of heat that is dissipated by the CPU. The constantly changing clock causes many components to switch regardless of whether they are being used at that time. In general, a component that is switching uses more energy than an element in a static state. Therefore, as clock rate increases, so does heat dissipation, causing the CPU to require more effective cooling solutions.
One method of dealing with the switching of unneeded components is called clock gating, which involves turning off the clock signal to unneeded components (effectively disabling them). However, this is often regarded as difficult to implement and therefore does not see common usage outside of very low-power designs. One notable recent CPU design that uses clock gating is that of the IBM PowerPC-based Xbox 360, which uses extensive clock gating to reduce the power requirements of the console.[11] Another method of addressing some of the problems with a global clock signal is the removal of the clock signal altogether. While removing the global clock signal makes the design process considerably more complex in many ways, asynchronous (or clockless) designs carry marked advantages in power consumption and heat dissipation in comparison with similar synchronous designs. While somewhat uncommon, entire asynchronous CPUs have been built without utilizing a global clock signal. Two notable examples of this are the ARM-compliant AMULET and the MIPS R3000-compatible MiniMIPS. Rather than totally removing the clock signal, some CPU designs allow certain portions of the device to be asynchronous, such as using asynchronous ALUs in conjunction with superscalar pipelining to achieve some arithmetic performance gains. While it is not altogether clear whether totally asynchronous designs can perform at a comparable or better level than their synchronous counterparts, it is evident that they do at least excel in simpler math operations. This, combined with their excellent power consumption and heat dissipation properties, makes them very suitable for embedded computers.[12]
Parallelism
Main article: Parallel computing
Model of a subscalar CPU. Notice that it takes fifteen cycles to complete three instructions.
The description of the basic operation of a CPU offered in the previous section describes the simplest form that a CPU can take. This type of CPU, usually referred to as subscalar, operates on and executes one instruction on one or two pieces of data at a time.
This process gives rise to an inherent inefficiency in subscalar CPUs. Since only one instruction is executed at a time, the entire CPU must wait for that instruction to complete before proceeding to the next instruction. As a result, the subscalar CPU gets "hung up" on instructions which take more than one clock cycle to complete execution. Even adding a second execution unit (see below) does not improve performance much; rather than one pathway being hung up, now two pathways are hung up and the number of unused transistors is increased. This design, wherein the CPU's execution resources can operate on only one instruction at a time, can only possibly reach scalar performance (one instruction per clock). However, the performance is nearly always subscalar (less than one instruction per cycle).
Attempts to achieve scalar and better performance have resulted in a variety of design methodologies that cause the CPU to behave less linearly and more in parallel. When referring to parallelism in CPUs, two terms are generally used to classify these design techniques. Instruction level parallelism (ILP) seeks to increase the rate at which instructions are executed within a CPU (that is, to increase the utilization of on-die execution resources), and thread level parallelism (TLP) aims to increase the number of threads (effectively individual programs) that a CPU can execute simultaneously. The two methodologies differ both in how they are implemented and in the relative effectiveness they afford in increasing the CPU's performance for an application.[13]
Instruction level parallelism
Main articles: Instruction pipelining and Superscalar
Basic five-stage pipeline. In the best case scenario, this pipeline can sustain a completion rate of one instruction per cycle.
One of the simplest methods used to accomplish increased parallelism is to begin the first steps of instruction fetching and decoding before the prior instruction finishes executing. This is the simplest form of a technique known as instruction pipelining, and is utilized in almost all modern general-purpose CPUs. Pipelining allows more than one instruction to be executed at any given time by breaking down the execution pathway into discrete stages. This separation can be compared to an assembly line, in which an instruction is made more complete at each stage until it exits the execution pipeline and is retired.
Pipelining does, however, introduce the possibility of a situation where the result of the previous operation is needed to complete the next operation, a condition often termed a data dependency conflict. To cope with this, additional care must be taken to check for these sorts of conditions and delay a portion of the instruction pipeline if this occurs. Naturally, accomplishing this requires additional circuitry, so pipelined processors are more complex than subscalar ones (though not very significantly so). A pipelined processor can become very nearly scalar, inhibited only by pipeline stalls (an instruction spending more than one clock cycle in a stage).
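A simple cycle count shows why pipelining approaches one instruction per cycle. The sketch below assumes an idealized five-stage pipeline in which every stage takes one cycle and each stall adds exactly one cycle; the function names are hypothetical.

```python
def subscalar_cycles(instructions, stages=5):
    """Cycles needed when each instruction finishes before the next starts."""
    return instructions * stages


def pipelined_cycles(instructions, stages=5, stall_cycles=0):
    """Cycles needed when a new instruction can enter the pipeline every cycle.

    The first instruction takes `stages` cycles to fill the pipeline; each
    later instruction completes one cycle after the one before it, plus any
    cycles lost to stalls.
    """
    if instructions == 0:
        return 0
    return stages + (instructions - 1) + stall_cycles


print(subscalar_cycles(3))                  # 15 cycles for three instructions
print(pipelined_cycles(3))                  # 7 cycles with no stalls
print(pipelined_cycles(3, stall_cycles=2))  # 9 cycles with two stall cycles
```

As the instruction count grows, the pipeline fill cost becomes negligible and the pipelined total approaches one cycle per instruction, provided stalls are rare.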
Simple superscalar pipeline. By fetching and dispatching two instructions at a time, a maximum of two instructions per cycle can be completed.
Further improvement upon the idea of instruction pipelining led to the development of a method that decreases the idle time of CPU components even further. Designs that are said to be superscalar include a long instruction pipeline and multiple identical execution units.[14] In a superscalar pipeline, multiple instructions are read and passed to a dispatcher, which decides whether or not the instructions can be executed in parallel (simultaneously). If so they are dispatched to available execution units, resulting in the ability for several instructions to be executed simultaneously. In general, the more instructions a superscalar CPU is able to dispatch simultaneously to waiting execution units, the more instructions will be completed in a given cycle.
Most of the difficulty in the design of a superscalar CPU architecture lies in creating an effective dispatcher. The dispatcher needs to be able to quickly and correctly determine whether instructions can be executed in parallel, as well as dispatch them in such a way as to keep as many execution units busy as possible. This requires that the instruction pipeline is filled as often as possible and gives rise to the need in superscalar architectures for significant amounts of CPU cache. It also makes hazard-avoiding techniques like branch prediction, speculative execution, and out-of-order execution crucial to maintaining high levels of performance. By attempting to predict which branch (or path) a conditional instruction will take, the CPU can minimize the number of times that the entire pipeline must wait until a conditional instruction is completed. Speculative execution often provides modest performance increases by executing portions of code that may not be needed after a conditional operation completes. Out-of-order execution somewhat rearranges the order in which instructions are executed to reduce delays due to data dependencies. In the case of single instruction, multiple data (SIMD) workloads, where a large amount of data of the same type has to be processed, modern processors can also disable parts of the pipeline so that when a single instruction is executed many times, the CPU skips the fetch and decode phases. This can greatly increase performance in highly repetitive workloads such as video creation and photo processing software.
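The dispatcher's core decision, whether neighbouring instructions can be issued together, can be sketched as below. This is a deliberately simplified in-order model: instructions are hypothetical tuples of (destination, source, source), only register dependencies within a group are checked, and real dispatchers also account for execution unit availability, memory hazards, and much more.

```python
def dispatch(instructions, width=2):
    """Group instructions, in program order, into bundles issued in one cycle.

    An instruction joins the current bundle only if the bundle has room and
    the instruction neither reads nor writes a register that an earlier
    instruction in the same bundle writes.
    """
    bundles = []
    current = []     # instructions selected for the current cycle
    written = set()  # registers written by the current bundle
    for dest, *sources in instructions:
        depends = dest in written or any(s in written for s in sources)
        if current and (len(current) == width or depends):
            bundles.append(current)       # close the bundle, start a new cycle
            current, written = [], set()
        current.append((dest, *sources))
        written.add(dest)
    if current:
        bundles.append(current)
    return bundles


code = [
    ("r1", "r2", "r3"),  # independent of the next instruction
    ("r4", "r5", "r6"),
    ("r7", "r1", "r4"),  # starts a new bundle: the first one is full
    ("r8", "r7", "r2"),  # needs r7, so it must wait for the previous bundle
]
for cycle, bundle in enumerate(dispatch(code)):
    print(cycle, bundle)
```

Here four instructions issue in three cycles instead of four; the dependency on r7 is what prevents the last two from sharing a cycle.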
In the case where a portion of the CPU is superscalar and part is not, the part which is not suffers a performance penalty due to scheduling stalls. The Intel P5 Pentium had two superscalar ALUs which could accept one instruction per clock each, but its FPU could not accept one instruction per clock. Thus the P5 was integer superscalar but not floating point superscalar. Intel's successor to the P5 architecture, P6, added superscalar capabilities to its floating point features, and therefore afforded a significant increase in floating point instruction performance.
Both simple pipelining and superscalar design increase a CPU's ILP by allowing a single processor to complete execution of instructions at rates surpassing one instruction per cycle (IPC).[15] Most modern CPU designs are at least somewhat superscalar, and nearly all general purpose CPUs designed in the last decade are superscalar. In later years some of the emphasis in designing high-ILP computers has been moved out of the CPU's hardware and into its software interface, or ISA. The strategy of the very long instruction word (VLIW) causes some ILP to become implied directly by the software, reducing the amount of work the CPU must perform to boost ILP and thereby reducing the design's complexity.
Thread-level parallelism
Another strategy for achieving performance is to execute multiple programs or threads in parallel. This area of research is known as parallel computing. In Flynn's taxonomy, this strategy is known as multiple instruction, multiple data (MIMD).
One technology used for this purpose was multiprocessing (MP). The initial flavor of this technology is known as symmetric multiprocessing (SMP), where a small number of CPUs share a coherent view of their memory system. In this scheme, each CPU has additional hardware to maintain a constantly up-to-date view of memory. By avoiding stale views of memory, the CPUs can cooperate on the same program and programs can migrate from one CPU to another. To increase the number of cooperating CPUs beyond a handful, schemes such as non-uniform memory access (NUMA) and directory-based coherence protocols were introduced in the 1990s. SMP systems are limited to a small number of CPUs while NUMA systems have been built with thousands of processors. Initially, multiprocessing was built using multiple discrete CPUs and boards to implement the interconnect between the processors. When the processors and their interconnect are all implemented on a single silicon chip, the technology is known as a multi-core microprocessor.
It was later recognized that finer-grain parallelism existed within a single program. A single program might have several threads (or functions) that could be executed separately or in parallel. Some of the earliest examples of this technology implemented input/output processing such as direct memory access as a separate thread from the computation thread. A more general approach to this technology was introduced in the 1970s when systems were designed to run multiple computation threads in parallel. This technology is known as multi-threading (MT). This approach is considered more cost-effective than multiprocessing, as only a small number of components within a CPU is replicated in order to support MT as opposed to the entire CPU in the case of MP. In MT, the execution units and the memory system including the caches are shared among multiple threads. The downside of MT is that the hardware support for multithreading is more visible to software than that of MP and thus supervisor software like operating systems have to undergo larger changes to support MT. One type of MT that was implemented is known as block multithreading, where one thread is executed until it is stalled waiting for data to return from external memory. In this scheme, the CPU then quickly switches to another thread which is ready to run, often within one CPU clock cycle, as in UltraSPARC technology. Another type of MT is known as simultaneous multithreading, where instructions of multiple threads are executed in parallel within one CPU clock cycle.
For several decades from the 1970s to early 2000s, the focus in designing high performance general purpose CPUs was largely on achieving high ILP through technologies such as pipelining, caches, superscalar execution, out-of-order execution, etc. This trend culminated in large, power-hungry CPUs such as the Intel Pentium 4. By the early 2000s, CPU designers were thwarted from achieving higher performance from ILP techniques due to the growing disparity between CPU operating frequencies and main memory operating frequencies as well as escalating CPU power dissipation owing to more esoteric ILP techniques.
CPU designers then borrowed ideas from commercial computing markets such as transaction processing, where the aggregate performance of multiple programs, also known as throughput computing, was more important than the performance of a single thread or program.
This reversal of emphasis is evidenced by the proliferation of dual and multiple core CMP (chip-level multiprocessing) designs and, notably, Intel's newer designs resembling its less superscalar P6 architecture. Recent designs in several processor families exhibit CMP, including the x86-64 Opteron and Athlon 64 X2, the SPARC UltraSPARC T1, IBM POWER4 and POWER5, as well as several video game console CPUs like the Xbox 360's triple-core PowerPC design and the PS3's 7-core Cell microprocessor.
Data parallelism
Main articles: Vector processor and SIMD
A less common but increasingly important paradigm of CPUs (and indeed, computing in general) deals with data parallelism. The processors discussed earlier are all referred to as some type of scalar device.[16] As the name implies, vector processors deal with multiple pieces of data in the context of one instruction. This contrasts with scalar processors, which deal with one piece of data for every instruction. Using Flynn's taxonomy, these two schemes of dealing with data are generally referred to as SIMD (single instruction, multiple data) and SISD (single instruction, single data), respectively. The great utility in creating CPUs that deal with vectors of data lies in optimizing tasks that tend to require the same operation (for example, a sum or a dot product) to be performed on a large set of data. Some classic examples of these types of tasks are multimedia applications (images, video, and sound), as well as many types of scientific and engineering tasks. Whereas a scalar CPU must complete the entire process of fetching, decoding, and executing each instruction and value in a set of data, a vector CPU can perform a single operation on a comparatively large set of data with one instruction. Of course, this is only possible when the application tends to require many steps which apply one operation to a large set of data.
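The difference between the two schemes can be illustrated by counting how many instructions each would need for the same element-wise addition. This is a software model only: Python itself does not execute SIMD instructions here, and the four-lane width and the function names are assumptions chosen for the example.

```python
def scalar_add(a, b):
    """SISD model: one add instruction per pair of values."""
    result, instructions = [], 0
    for x, y in zip(a, b):
        result.append(x + y)
        instructions += 1
    return result, instructions


def simd_add(a, b, lanes=4):
    """SIMD model: one add instruction handles up to `lanes` pairs at once."""
    result, instructions = [], 0
    for start in range(0, len(a), lanes):
        chunk_a = a[start:start + lanes]
        chunk_b = b[start:start + lanes]
        result.extend(x + y for x, y in zip(chunk_a, chunk_b))
        instructions += 1  # the whole chunk counts as a single instruction
    return result, instructions


a = list(range(8))
b = [10] * 8
print(scalar_add(a, b))  # same sums, 8 instructions
print(simd_add(a, b))    # same sums, 2 instructions
```

Both functions produce identical results; only the number of fetch, decode, and execute sequences differs, which is where the benefit of vector hardware comes from.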
Most early vector CPUs, such as the Cray-1, were associated almost exclusively with scientific research and cryptography applications. However, as multimedia has largely shifted to digital media, the need for some form of SIMD in general-purpose CPUs has become significant. Shortly after floating-point execution units became commonplace in general-purpose processors, specifications for and implementations of SIMD execution units also began to appear for general-purpose CPUs. Some of these early SIMD specifications, such as HP's Multimedia Acceleration eXtensions (MAX) and Intel's MMX, were integer-only. This proved to be a significant impediment for some software developers, since many of the applications that benefit from SIMD primarily deal with floating-point numbers. Progressively, these early designs were refined and remade into some of the common, modern SIMD specifications, which are usually associated with one ISA. Some notable modern examples are Intel's SSE and the PowerPC-related AltiVec (also known as VMX).[17]
Performance
Main article: computer performance
The performance or speed of a processor depends on the clock rate (generally given in multiples of hertz) and the instructions per clock (IPC), which together are the factors for the instructions per second (IPS) that the CPU can perform.[18] Many reported IPS values have represented "peak" execution rates on artificial instruction sequences with few branches, whereas realistic workloads consist of a mix of instructions and applications, some of which take longer to execute than others. The performance of the memory hierarchy also greatly affects processor performance, an issue barely considered in MIPS (millions of instructions per second) calculations. Because of these problems, various standardized tests, often called "benchmarks" for this purpose (SPECint is one example), have been developed to attempt to measure the real effective performance in commonly used applications.
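The relationship between clock rate, IPC and IPS is a simple product, which the following minimal sketch makes concrete. The clock rate and both IPC figures are hypothetical values chosen for illustration, not measurements of any real processor.

```python
def instructions_per_second(clock_hz, ipc):
    """IPS as the product of clock rate (Hz) and instructions per clock."""
    return clock_hz * ipc


CLOCK_HZ = 3_000_000_000  # hypothetical 3 GHz clock

# Hypothetical "peak" IPC on an artificial sequence with few branches.
peak_ips = instructions_per_second(CLOCK_HZ, 4)

# Hypothetical sustained IPC on a realistic mix of instructions.
sustained_ips = instructions_per_second(CLOCK_HZ, 1.5)

print(peak_ips / 1e6, "MIPS peak")
print(sustained_ips / 1e6, "MIPS sustained")
```

With these assumed numbers the same chip would be quoted at very different IPS figures depending on which IPC is used, which is why peak figures say little about realistic workloads.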
The processing performance of computers is increased by using multi-core processors, which essentially means integrating two or more individual processors (called cores in this sense) into one integrated circuit.[19] Ideally, a dual-core processor would be nearly twice as powerful as a single-core processor. In practice, however, the performance gain is far less, only about 50%,[19] due to imperfect software algorithms and implementation.
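One common way to reason about this shortfall is Amdahl's law, which the text above does not name and which is brought in here only as an illustrative model. It assumes a fixed fraction of the work can be spread across cores while the rest runs serially. The function name and the two-thirds figure are assumptions chosen so that the result lands near the roughly 50% gain quoted above.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Speedup predicted by Amdahl's law for a given parallelizable fraction."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


# Ideal case: all of the work can be split across both cores.
ideal = amdahl_speedup(1.0, 2)

# Assumed case: only two-thirds of the work can run in parallel.
realistic = amdahl_speedup(2.0 / 3.0, 2)

print(ideal, realistic)
```

Under this model a dual-core gain of about 50% corresponds to software in which roughly a third of the work stays serial. Real workloads lose performance for other reasons too, such as synchronization and contention for shared memory.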