Kinect is a discontinued line of motion sensing input devices produced by Microsoft and first released in 2010. The devices generally contain RGB cameras, and infrared projectors and detectors that map depth through either structured light or time of flight calculations, which can in turn be used to perform real-time gesture recognition and body skeletal detection, among other capabilities. They also contain microphones that can be used for speech recognition and voice control. Kinect was originally developed as a motion controller peripheral for Xbox video game consoles, distinguished from competitors (such as Nintendo's Wii Remote and Sony's PlayStation Move) by not requiring physical controllers. The first-generation Kinect was based on technology from Israeli company PrimeSense, and unveiled at E3 2009 as a peripheral for Xbox 360 codenamed "Project Natal". It was first released on November 4, 2010, and would go on to sell eight million units in its first 60 days of availability. The majority of the games developed for Kinect were casual, family-oriented titles, which helped to attract new audiences to Xbox 360, but did not result in wide adoption by the console's existing, overall userbase. As part of the 2013 unveiling of Xbox 360's successor, Xbox One, Microsoft unveiled a second-generation version of Kinect with improved tracking capabilities. Microsoft also announced that Kinect would be a required component of the console, and that it would not function unless the peripheral is connected. The requirement proved controversial among users and critics due to privacy concerns, prompting Microsoft to backtrack on the decision. However, Microsoft still bundled the new Kinect with Xbox One consoles upon their launch in November 2013. A market for Kinect-based games still did not emerge after the Xbox One's launch; Microsoft would later offer Xbox One hardware bundles without Kinect included, and later revisions of the console removed the dedicated ports used to connect it (requiring a powered USB adapter instead). Microsoft ended production of Kinect for Xbox One in October 2017. Kinect has also been used as part of non-game applications in academic and commercial environments, as it was cheaper and more robust than other depth-sensing technologies at the time. While Microsoft initially objected to such applications, it later released software development kits (SDKs) for the development of Microsoft Windows applications that use Kinect. In 2020, Microsoft released Azure Kinect as a continuation of the technology integrated with the Microsoft Azure cloud computing platform. Part of the Kinect technology was also used within Microsoft's HoloLens project. Microsoft discontinued the Azure Kinect developer kits in October 2023. == History == === Development === The origins of the Kinect started around 2005, at a point where technology vendors were starting to develop depth-sensing cameras. Microsoft had been interested in a 3D camera for the Xbox line earlier but because the technology had not been refined, had placed it in the "Boneyard", a collection of possible technology they could not immediately work on. In 2005, Israeli company PrimeSense was founded by mathematicians and engineers to develop the "next big thing" for video games, incorporating cameras that were capable of mapping a human body in front of them and sensing hand motions. They showed off their system at the 2006 Game Developers Conference, where Microsoft's Alex Kipman, the general manager of hardware incubation, saw the potential in PrimeSense's technology for the Xbox system. Microsoft began discussions with PrimeSense about what would need to be done to make their product more consumer-friendly: not only improvements in the capabilities of depth-sensing cameras, but a reduction in size and cost, and a means to manufacture the units at scale was required. PrimeSense spent the next few years working at these improvements. Nintendo released the Wii in November 2006. The Wii's central feature was the Wii Remote, a handheld device that was detected by the Wii through a motion sensor bar mounted onto a television screen to enable motion controlled games. Microsoft felt pressure from the Wii, and began looking into depth-sensing in more detail with PrimeSense's hardware, but could not get to the level of motion tracking they desired. While they could determine hand gestures, and sense the general shape of a body, they could not do skeletal tracking. A separate path within Microsoft looked to create an equivalent of the Wii Remote, considering that this type of unit may become standardized similar to how two-thumbstick controllers became a standard feature. However, it was still ultimately Microsoft's goal to remove any device between the player and the Xbox. Kudo Tsunoda and Darren Bennett joined Microsoft in 2008, and began working with Kipman on a new approach to depth-sensing aided by machine learning to improve skeletal tracking. They internally demonstrated this and established where they believed the technology could be in a few years, which led to the strong interest to fund further development of the technology; this has also occurred at a time that Microsoft executives wanted to abandon the Wii-like motion tracking approach, and favored the depth-sensing solution to present a product that went beyond the Wii's capabilities. The project was greenlit by late 2008 with work started in 2009. The project was codenamed "Project Natal" after the Brazilian city Natal, Kipman's birthplace. Additionally, Kipman recognized the Latin origins of the word "natal" to mean "to be born", reflecting the new types of audiences they hoped to draw with the technology. Much of the initial work was related to ethnographic research to see how video game players' home environments were laid out, lit, and how those with Wiis used the system to plan how Kinect units would be used. The Microsoft team discovered from this research that the up-and-down angle of the depth-sensing camera would either need to be adjusted manually, or would require an expensive motor to move automatically. Upper management at Microsoft opted to include the motor despite the increased cost to avoid breaking game immersion. Kinect project work also involved packaging the system for mass production and optimizing its performance. Hardware development took around 22 months. During hardware development, Microsoft engaged with software developers to use Kinect. Microsoft wanted to make games that would be playable by families since Kinect could sense multiple bodies in front of it. One of the first internal titles developed for the device was the pack-in game Kinect Adventures developed by Good Science Studio that was part of Microsoft Studios. One of the game modes of Kinect Adventures was "Reflex Ridge", based on the Japanese Brain Wall game where players attempt to contort their bodies in a short time to match cutouts of a wall moving at them. This type of game was a key example of the type of interactivity they wanted with Kinect, and its development helped feed into the hardware improvements. Another development was Project Milo, a prototype game developed by Lionhead Studios led by Peter Molyneux where the player could interact with a virtual avatar through motion controls and voice recognition. Lionhead had developed the project based on original capabilities of the Kinect, but according to Molyneux, Microsoft had found that a consumer-grade version of the Kinect would cost thousands of dollars, so they scaled back the device and refocused the role of games for the Kinect to be more casual games as seen on the Wii. As a result, Project Milo no longer fit Microsoft's portfolio and was cancelled. Nearing the planned release, there was a problem of widespread testing of Kinect in various room types and different bodies accounting for age, gender, and race among other factors, while keeping the details of the unit confidential. Microsoft engaged in a company-wide program offering employees to take home Kinect units to test them. Microsoft also brought other non-gaming divisions, including its Microsoft Research, Microsoft Windows, and Bing teams to help complete the system. Microsoft established its own large-scale manufacturing facility to bulk product Kinect units and test them. === Introduction === Kinect was first announced to the public as "Project Natal" on June 1, 2009, during Microsoft's press conference at E3 2009; film director Steven Spielberg joined Microsoft's Don Mattrick to introduce the technology and its potential. Three demos were presented during the conference—Microsoft's Ricochet and Paint Party, and Lionhead Studios' Milo & Kate created by Peter Molyneux—while a Project Natal-enabled version of Criterion Games' Burnout Paradise was shown during the E3 exhibition. By E3 2009, the skeletal mapping technology was capable of simultaneously tracking four people, with a feature extraction of 4
Stochastic parrot
In machine learning, the term stochastic parrot is a metaphor that frames large language models as systems that statistically mimic text without real understanding. The word "stochastic" – from the ancient Greek "στοχαστικός" (stokhastikos, 'based on guesswork') – is a term from probability theory meaning "randomly determined". The word "parrot" refers to parrots' ability to mimic human speech. The term was introduced in a 2021 paper on AI ethics titled "On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? 🦜" and authored by Timnit Gebru, Emily M. Bender, Angelina McMillan-Major, and Margaret Mitchell. The paper outlined possible risks associated with large language models (LLMs). In December 2020, it was the subject of a workplace dispute between Gebru (then co-leader of Google's Ethical Artificial Intelligence Team) and Google, which had requested the retraction of the paper. The incident culminated in Gebru's controversial departure from the company. The paper was later presented at the 2021 ACM Conference, and the term "stochastic parrot" has seen widespread use in academic research concerning generative AI and LLMs. The term has been interpreted negatively as an insult towards AI. == Background == Timnit Gebru is an AI ethics researcher, Emily M. Bender is a linguist specializing in computational linguistics, and Margaret Mitchell is a computer scientist specializing in algorithmic bias. Gebru had joined Google in 2018, where she co-led a team on the ethics of artificial intelligence with Mitchell. In late 2020, the paper "On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? 🦜" was co-written by Gebru and five other researchers, four of whom were Google employees. The paper argues that large language models (LLMs) present significant risks such as environmental and financial costs, inscrutability leading to unknown dangerous biases, and potential for deception as LLMs do not understand the concepts underlying what they learn. The paper states that LLMs are "stitching together sequences of linguistic forms ... observed in its vast training data, according to probabilistic information about how they combine, but without any reference to meaning." Therefore, they are labeled "stochastic parrots". === Dismissal of Gebru by Google === After the paper was submitted for consideration to the 2021 ACM Conference, Google requested that Gebru either retract the paper from the conference or remove the names of Google employees from it. Gebru refused to do so without further discussion, and emailed Google Research vice president Megan Kacholia that if the company could not explain the request for retraction and address other concerns regarding similar projects, she would plan to resign after a transition period, stating that they could "work on a last date". The following day, on December 2, 2020, Gebru received an email saying that Google was "accepting her resignation". Her abrupt firing sparked protests by Google employees and negative publicity for the company. == Usage == The phrase has been used by AI skeptics to signify that LLMs lack understanding of the meaning of their outputs. Sam Altman, CEO of OpenAI, used the term shortly after the release of ChatGPT in December 2022, tweeting "i am a stochastic parrot, and so r u". The term was nominated as the 2023 AI-related Word of the Year by the American Dialect Society. == Debate == Some LLMs, such as ChatGPT, have become capable of interacting with users in convincingly human-like conversations. The development of these new systems has deepened the discussion of the extent to which LLMs understand or are simply "parroting". According to machine learning researchers Lindholm, Wahlström, Lindsten, and Schön, the term "stochastic parrot" highlights two vital limitations of LLMs: LLMs are limited by the data they are trained on and are simply stochastically repeating contents of datasets. Because they are just making up outputs based on training data, LLMs do not understand if they are saying something incorrect or inappropriate. Lindholm et al. noted that, with poor quality datasets and other limitations, a learning machine might produce results that are "dangerously wrong". === Subjective experience === In the mind of a human being, words and language correspond to things one has experienced. For LLMs, according to proponents of the theory, words correspond only to other words and patterns of usage fed into their training data. Proponents of the idea of stochastic parrots thus conclude that statements about LLMs are due to "the human tendency to attribute meaning to text", and claim this occurs despite the LLMs not actually understanding language. === Fine-tuning === Kelsey Piper argued that the claim that LLMs are stochastic parrots or mere "next-token predictors" focuses on pre-training, ignoring that modern LLMs are also fine-tuned to follow instructions and to prefer accurate answers. === Hallucinations and mistakes === The tendency of LLMs to pass off false information as fact is held as support. Called hallucinations or confabulations, LLMs will occasionally synthesize information that matches some pattern. LLMs may fail to distinguish fact and fiction, which leads to the claim that they can't connect words to a comprehension of the world, as humans do. Furthermore, LLMs may fail to decipher complex or ambiguous grammar cases that rely on understanding the meaning of language. For example: The wet newspaper that fell down off the table is my favorite newspaper. But now that my favorite newspaper fired the editor I might not like reading it anymore. Can I replace 'my favorite newspaper' by 'the wet newspaper that fell down off the table' in the second sentence? GPT-4, an LLM released in March 2023, responded yes, not understanding that the meaning of "newspaper" is different in these two contexts; it is first an object and second an institution. === Benchmarks and experiments === One argument against the hypothesis that LLMs are stochastic parrot is their results on benchmarks for reasoning, common sense and language understanding. In 2023, some LLMs have shown good results on many language understanding tests, such as the Super General Language Understanding Evaluation (SuperGLUE). GPT-4 scored in the >90th-percentile on the Uniform Bar Examination and achieved 93% accuracy on the MATH benchmark of high-school Olympiad problems, results that exceed rote pattern-matching expectations. Such tests, and the smoothness of many LLM responses, help as many as 51% of AI professionals believe they can truly understand language with enough data, according to a 2022 survey. === Expert rebuttals === Some AI researchers dispute the notion that LLMs merely "parrot" their training data. Geoffrey Hinton, a pioneering figure in neural networks, counters that the metaphor misunderstands the prerequisite for accurate language prediction. He argues that "to predict the next word accurately, you have to understand the sentence", a view he presented on 60 Minutes in 2023. From this perspective, understanding is not an alternative to statistical prediction, but rather an emergent property required to perform it effectively at scale. Hinton also uses logical puzzles to demonstrate that LLMs actually understand language. A 2024 Scientific American investigation described a closed Berkeley workshop where state-of-the-art models solved novel tier-4 mathematics problems and produced coherent proofs, indicating reasoning abilities beyond memorization. The GPT-4 Technical Report showed human-level results on professional and academic exams (e.g., the Uniform Bar Exam and USMLE), challenging the "parrot" characterization. Anthropic conducted mechanistic interpretability research on Claude, using attribution graphs to identify circuits. The research showed how the LLM processes information via chains of fuzzy logical inference, and indicated an ability to plan ahead. They found that Claude 3.5 Haiku "employs remarkably general abstractions", forms "internally generated plans for its future outputs" and "works backwards from its longer-term goals". They noted that "The mechanisms of the model can apparently only be faithfully described using an overwhelmingly large causal graph." They also found that the model includes "mechanisms that could underlie a simple form of metacognition", in that it "thinks about" the level of its own knowledge before reaching its answer. === Interpretability === Another line of evidence against the 'stochastic parrot' claim comes from mechanistic interpretability, a research field dedicated to reverse-engineering LLMs to understand their internal workings. Rather than only observing the model's input-output behavior, these techniques probe the model's internal activations, which can be used to determine if they contain structured representations of the world. The goal is to investigate whether LLMs are merely manipulating surface statistics or if t
Stochastic block model
The stochastic block model is a generative model for random graphs. This model tends to produce graphs containing communities, subsets of nodes characterized by being connected with one another with particular edge densities. For example, edges may be more common within communities than between communities. Its mathematical formulation was first introduced in 1983 in the field of social network analysis by Paul W. Holland et al. The stochastic block model is important in statistics, machine learning, and network science, where it serves as a useful benchmark for the task of recovering community structure in graph data. == Definition == The stochastic block model takes the following parameters: The number n {\displaystyle n} of vertices; a partition of the vertex set { 1 , … , n } {\displaystyle \{1,\ldots ,n\}} into disjoint subsets C 1 , … , C r {\displaystyle C_{1},\ldots ,C_{r}} , called communities; a symmetric r × r {\displaystyle r\times r} matrix P {\displaystyle P} of edge probabilities. The edge set is then sampled at random as follows: any two vertices u ∈ C i {\displaystyle u\in C_{i}} and v ∈ C j {\displaystyle v\in C_{j}} are connected by an edge with probability P i j {\displaystyle P_{ij}} . An example problem is: given a graph with n {\displaystyle n} vertices, where the edges are sampled as described, recover the groups C 1 , … , C r {\displaystyle C_{1},\ldots ,C_{r}} . == Special cases == If the probability matrix is a constant, in the sense that P i j = p {\displaystyle P_{ij}=p} for all i , j {\displaystyle i,j} , then the result is the Erdős–Rényi model G ( n , p ) {\displaystyle G(n,p)} . This case is degenerate—the partition into communities becomes irrelevant—but it illustrates a close relationship to the Erdős–Rényi model. The planted partition model is the special case that the values of the probability matrix P {\displaystyle P} are a constant p {\displaystyle p} on the diagonal and another constant q {\displaystyle q} off the diagonal. Thus two vertices within the same community share an edge with probability p {\displaystyle p} , while two vertices in different communities share an edge with probability q {\displaystyle q} . Sometimes it is this restricted model that is called the stochastic block model. The case where p > q {\displaystyle p>q} is called an assortative model, while the case p < q {\displaystyle p P j k {\displaystyle P_{ii}>P_{jk}} whenever j ≠ k {\displaystyle j\neq k} : all diagonal entries dominate all off-diagonal entries. A model is called weakly assortative if P i i > P i j {\displaystyle P_{ii}>P_{ij}} whenever i ≠ j {\displaystyle i\neq j} : each diagonal entry is only required to dominate the rest of its own row and column. Disassortative forms of this terminology exist, by reversing all inequalities. For some algorithms, recovery might be easier for block models with assortative or disassortative conditions of this form. == Typical statistical tasks == Much of the literature on algorithmic community detection addresses three statistical tasks: detection, partial recovery, and exact recovery. === Detection === The goal of detection algorithms is simply to determine, given a sampled graph, whether the graph has latent community structure. More precisely, a graph might be generated, with some known prior probability, from a known stochastic block model, and otherwise from a similar Erdos-Renyi model. The algorithmic task is to correctly identify which of these two underlying models generated the graph. === Partial recovery === In partial recovery, the goal is to approximately determine the latent partition into communities, in the sense of finding a partition that is correlated with the true partition significantly better than a random guess. === Exact recovery === In exact recovery, the goal is to recover the latent partition into communities exactly. The community sizes and probability matrix may be known or unknown. == Statistical lower bounds and threshold behavior == Stochastic block models exhibit a sharp threshold effect reminiscent of percolation thresholds. Suppose that we allow the size n {\displaystyle n} of the graph to grow, keeping the community sizes in fixed proportions. If the probability matrix remains fixed, tasks such as partial and exact recovery become feasible for all non-degenerate parameter settings. However, if we scale down the probability matrix at a suitable rate as n {\displaystyle n} increases, we observe a sharp phase transition: for certain settings of the parameters, it will become possible to achieve recovery with probability tending to 1, whereas on the opposite side of the parameter threshold, the probability of recovery tends to 0 no matter what algorithm is used. For partial recovery, the appropriate scaling is to take P i j = P ~ i j / n {\displaystyle P_{ij}={\tilde {P}}_{ij}/n} for fixed P ~ {\displaystyle {\tilde {P}}} , resulting in graphs of constant average degree. In the case of two equal-sized communities, in the assortative planted partition model with probability matrix P = ( p ~ / n q ~ / n q ~ / n p ~ / n ) , {\displaystyle P=\left({\begin{array}{cc}{\tilde {p}}/n&{\tilde {q}}/n\\{\tilde {q}}/n&{\tilde {p}}/n\end{array}}\right),} partial recovery is feasible with probability 1 − o ( 1 ) {\displaystyle 1-o(1)} whenever ( p ~ − q ~ ) 2 > 2 ( p ~ + q ~ ) {\displaystyle ({\tilde {p}}-{\tilde {q}})^{2}>2({\tilde {p}}+{\tilde {q}})} , whereas any estimator fails partial recovery with probability 1 − o ( 1 ) {\displaystyle 1-o(1)} whenever ( p ~ − q ~ ) 2 < 2 ( p ~ + q ~ ) {\displaystyle ({\tilde {p}}-{\tilde {q}})^{2}<2({\tilde {p}}+{\tilde {q}})} . For exact recovery, the appropriate scaling is to take P i j = P ~ i j log n / n {\displaystyle P_{ij}={\tilde {P}}_{ij}\log n/n} , resulting in graphs of logarithmic average degree. Here a similar threshold exists: for the assortative planted partition model with r {\displaystyle r} equal-sized communities, the threshold lies at p ~ − q ~ = r {\displaystyle {\sqrt {\tilde {p}}}-{\sqrt {\tilde {q}}}={\sqrt {r}}} . In fact, the exact recovery threshold is known for the fully general stochastic block model. == Algorithms == In principle, exact recovery can be solved in its feasible range using maximum likelihood, but this amounts to solving a constrained or regularized cut problem such as minimum bisection that is typically NP-complete. Hence, no known efficient algorithms will correctly compute the maximum-likelihood estimate in the worst case. However, a wide variety of algorithms perform well in the average case, and many high-probability performance guarantees have been proven for algorithms in both the partial and exact recovery settings. Successful algorithms include spectral clustering of the vertices, semidefinite programming, forms of belief propagation, and community detection among others. == Variants == Several variants of the model exist. One minor tweak allocates vertices to communities randomly, according to a categorical distribution, rather than in a fixed partition. More significant variants include the degree-corrected stochastic block model, the hierarchical stochastic block model, the geometric block model, censored block model and the mixed-membership block model. == Topic models == Stochastic block model have been recognised to be a topic model on bipartite networks. In a network of documents and words, Stochastic block model can identify topics: group of words with a similar meaning. == Extensions to signed graphs == Signed graphs allow for both favorable and adverse relationships and serve as a common model choice for various data analysis applications, e.g., correlation clustering. The stochastic block model can be trivially extended to signed graphs by assigning both positive and negative edge weights or equivalently using a difference of adjacency matrices of two stochastic block models. == DARPA/MIT/AWS Graph Challenge: streaming stochastic block partition == GraphChallenge encourages community approaches to developing new solutions for analyzing graphs and sparse data derived from social media, sensor feeds, and scientific data to enable relationships between events to be discovered as they unfold in the field. Streaming stochastic block partition is one of the challenges since 2017. Spectral clustering has demonstrated outstanding performance compared to the original and even improved base algorithm, matching its quality of clusters while being multiple orders of magnitude faster.
Relation network
A relation network (RN) is an artificial neural network component with a structure that can reason about relations among objects. An example category of such relations is spatial relations (above, below, left, right, in front of, behind). RNs can infer relations, they are data efficient, and they operate on a set of objects without regard to the objects' order. == History == In June 2017, DeepMind announced the first relation network. It claimed that the technology had achieved "superhuman" performance on multiple question-answering problem sets. == Design == RNs constrain the functional form of a neural network to capture the common properties of relational reasoning. These properties are explicitly added to the system, rather than established by learning just as the capacity to reason about spatial, translation-invariant properties is explicitly part of convolutional neural networks (CNN). The data to be considered can be presented as a simple list or as a directed graph whose nodes are objects and whose edges are the pairs of objects whose relationships are to be considered. The RN is a composite function: R N ( O ) = f ϕ ( ∑ i , j g θ ( o i , o j , q ) ) , {\displaystyle RN\left(O\right)=f_{\phi }\left(\sum _{i,j}g_{\theta }\left(o_{i},o_{j},q\right)\right),} where the input is a set of "objects" O = { o 1 , o 2 , . . . , o n } , o i ∈ R m {\displaystyle O=\left\lbrace o_{1},o_{2},...,o_{n}\right\rbrace ,o_{i}\in \mathbb {R} ^{m}} is the ith object, and fφ and gθ are functions with parameters φ and θ, respectively and q is the question. fφ and gθ are multilayer perceptrons, while the 2 parameters are learnable synaptic weights. RNs are differentiable. The output of gθ is a "relation"; therefore, the role of gθ is to infer any ways in which two objects are related. Image (128x128 pixel) processing is done with a 4-layer CNN. Outputs from the CNN are treated as the objects for relation analysis, without regard for what those "objects" explicitly represent. Questions were processed with a long short-term memory network.
Universal approximation theorem
In the field of machine learning, the universal approximation theorems (UATs) state that neural networks with a certain structure can, in principle, approximate any continuous function to any desired degree of accuracy. These theorems provide a mathematical justification for using neural networks, assuring researchers that a sufficiently large or deep network can model the complex, non-linear relationships often found in real-world data. The best-known version of the theorem applies to feedforward networks with a single hidden layer. It states that if the layer's activation function is non-polynomial (which is true for common choices like the sigmoid function or ReLU), then the network can act as a "universal approximator." Universality is achieved by increasing the number of neurons in the hidden layer, making the network "wider." Other versions of the theorem show that universality can also be achieved by keeping the network's width fixed but increasing its number of layers, making it "deeper." These are existence theorems. They guarantee that a network with the right structure exists, but they do not provide a method for finding the network's parameters (training it), nor do they specify exactly how large the network must be for a given function. Finding a suitable network remains a practical challenge that is typically addressed with optimization algorithms like backpropagation. == Setup == Artificial neural networks are combinations of multiple simple mathematical functions that implement more complicated functions from (typically) real-valued vectors to real-valued vectors. The spaces of multivariate functions that can be implemented by a network are determined by the structure of the network, the set of simple functions, and its multiplicative parameters. A great deal of theoretical work has gone into characterizing these function spaces. Most universal approximation theorems are in one of two classes. The first quantifies the approximation capabilities of neural networks with an arbitrary number of artificial neurons ("arbitrary width" case) and the second focuses on the case with an arbitrary number of hidden layers, each containing a limited number of artificial neurons ("arbitrary depth" case). In addition to these two classes, there are also universal approximation theorems for neural networks with bounded number of hidden layers and a limited number of neurons in each layer ("bounded depth and bounded width" case). == History == === Arbitrary width === The first results concerned the arbitrary width case. Ken-ichi Funahashi (May 1989) showed that Rumelhart–Hinton–Williams type backpropagation networks possess universal approximation capability with a class of sigmoidal activation functions, extending the result to multi-output mappings as well. Kurt Hornik, Maxwell Stinchcombe, and Halbert White (July 1989) showed that multilayer feed-forward networks with as few as one hidden layer are universal approximators, provided that the activation function satisfies certain conditions. George Cybenko (December 1989) independently established a related result for sigmoid activation functions using functional-analytic methods. Hornik also showed in 1991 that it is not the specific choice of the activation function but rather the multilayer feed-forward architecture itself that gives neural networks the potential of being universal approximators. Moshe Leshno et al in 1993 and later Allan Pinkus in 1999 showed that the universal approximation property is equivalent to having a nonpolynomial activation function. === Arbitrary depth === The arbitrary depth case was also studied by a number of authors such as Gustaf Gripenberg in 2003, Dmitry Yarotsky, Zhou Lu et al in 2017, Boris Hanin and Mark Sellke in 2018 who focused on neural networks with ReLU activation function. In 2020, Patrick Kidger and Terry Lyons extended those results to neural networks with general activation functions such, e.g. tanh or GeLU. One special case of arbitrary depth is that each composition component comes from a finite set of mappings. In 2024, Cai constructed a finite set of mappings, named a vocabulary, such that any continuous function can be approximated by compositing a sequence from the vocabulary. This is similar to the concept of compositionality in linguistics, which is the idea that a finite vocabulary of basic elements can be combined via grammar to express an infinite range of meanings. === Bounded depth and bounded width === The bounded depth and bounded width case was first studied by Maiorov and Pinkus in 1999. They showed that there exists an analytic sigmoidal activation function such that two hidden layer neural networks with bounded number of units in hidden layers are universal approximators. In 2018, Guliyev and Ismailov constructed a smooth sigmoidal activation function providing universal approximation property for two hidden layer feedforward neural networks with fewer units in hidden layers. In 2018, they also constructed single hidden layer networks with bounded width that are still universal approximators for univariate functions. However, this does not apply for multivariable functions. In 2022, Shen et al. obtained precise quantitative information on the depth and width required to approximate a target function by deep and wide ReLU neural networks. === Quantitative bounds === The question of minimal possible width for universality was first studied in 2021, Park et al obtained the minimum width required for the universal approximation of Lp functions using feed-forward neural networks with ReLU as activation functions. Similar results that can be directly applied to residual neural networks were also obtained in the same year by Paulo Tabuada and Bahman Gharesifard using control-theoretic arguments. In 2023, Cai obtained the optimal minimum width bound for the universal approximation. For the arbitrary depth case, Leonie Papon and Anastasis Kratsios derived explicit depth estimates depending on the regularity of the target function and of the activation function. === Kolmogorov network === The Kolmogorov–Arnold representation theorem is similar in spirit. Indeed, certain neural network families can directly apply the Kolmogorov–Arnold theorem to yield a universal approximation theorem. Robert Hecht-Nielsen showed that a three-layer neural network can approximate any continuous multivariate function. This was extended to the discontinuous case by Vugar Ismailov. In 2024, Ziming Liu and co-authors showed a practical application. === Reservoir computing and quantum reservoir computing === In reservoir computing a sparse recurrent neural network with fixed weights equipped of fading memory and echo state property is followed by a trainable output layer. Its universality has been demonstrated separately for what concerns networks of rate neurons and spiking neurons, respectively. In 2024, the framework has been generalized and extended to quantum reservoirs where the reservoir is based on qubits defined over Hilbert spaces. === Variants === Variants include discontinuous activation functions, noncompact domains, certifiable networks, random neural networks, and alternative network architectures and topologies. The universal approximation property of width-bounded networks has been studied as a dual of classical universal approximation results on depth-bounded networks. For input dimension d x {\displaystyle d_{x}} and output dimension d y {\displaystyle d_{y}} the minimum width required for the universal approximation of the Lp functions is exactly m a x { d x + 1 , d y } {\displaystyle max\{d_{x}+1,d_{y}\}} (for a ReLU network). More generally this also holds if both ReLU and a threshold activation function are used. Universal function approximation on graphs (or rather on graph isomorphism classes) by popular graph convolutional neural networks (GCNs or GNNs) can be made as discriminative as the Weisfeiler–Leman graph isomorphism test. In 2020, a universal approximation theorem result was established by Brüel-Gabrielsson, showing that graph representation with certain injective properties is sufficient for universal function approximation on bounded graphs and restricted universal function approximation on unbounded graphs, with an accompanying O ( | V | ⋅ | E | ) {\displaystyle {\mathcal {O}}(\left|V\right|\cdot \left|E\right|)} -runtime method that performed at state of the art on a collection of benchmarks (where V {\displaystyle V} and E {\displaystyle E} are the sets of nodes and edges of the graph respectively). There are also a variety of results between non-Euclidean spaces and other commonly used architectures and, more generally, algorithmically generated sets of functions, such as the convolutional neural network (CNN) architecture, radial basis functions, or neural networks with specific properties. == Arbitrary-width case == A universal approximation theorem formally states that a family of neural network funct
Per-pixel lighting
In computer graphics, per-pixel lighting refers to any technique for lighting an image or scene that calculates illumination for each pixel on a rendered image. This is in contrast to other popular methods of lighting such as vertex lighting, which calculates illumination at each vertex of a 3D model and then interpolates the resulting values over the model's faces to calculate the final per-pixel color values. Per-pixel lighting is commonly used with techniques, such as blending, alpha blending, alpha to coverage, anti-aliasing, texture filtering, clipping, hidden-surface determination, Z-buffering, stencil buffering, shading, mipmapping, normal mapping, bump mapping, displacement mapping, parallax mapping, shadow mapping, specular mapping, shadow volumes, high-dynamic-range rendering, ambient occlusion (screen space ambient occlusion, screen space directional occlusion, ray-traced ambient occlusion), ray tracing, global illumination, and tessellation. Each of these techniques provides some additional data about the surface being lit or the scene and light sources that contributes to the final look and feel of the surface. Most modern video game engines implement lighting using per-pixel techniques instead of vertex lighting to achieve increased detail and realism. The id Tech 4 engine, used to develop such games as Brink and Doom 3, was one of the first game engines to implement a completely per-pixel shading engine. All versions of the CryENGINE, Frostbite Engine, and Unreal Engine, among others, also implement per-pixel shading techniques. Deferred shading is a recent development in per-pixel lighting notable for its use in the Frostbite Engine and Battlefield 3. Deferred shading techniques are capable of rendering potentially large numbers of small lights inexpensively (other per-pixel lighting approaches require full-screen calculations for each light in a scene, regardless of size). == History == While only recently have personal computers and video hardware become powerful enough to perform full per-pixel shading in real-time applications such as games, many of the core concepts used in per-pixel lighting models have existed for decades. Frank Crow published a paper describing the theory of shadow volumes in 1977. This technique uses the stencil buffer to specify areas of the screen that correspond to surfaces that lie in a "shadow volume", or a shape representing a volume of space eclipsed from a light source by some object. These shadowed areas are typically shaded after the scene is rendered to buffers by storing shadowed areas with the stencil buffer. Jim Blinn first introduced the idea of normal mapping in a 1978 SIGGRAPH paper. Blinn pointed out that the earlier idea of unlit texture mapping proposed by Edwin Catmull was unrealistic for simulating rough surfaces. Instead of mapping a texture onto an object to simulate roughness, Blinn proposed a method of calculating the degree of lighting a point on a surface should receive based on an established "perturbation" of the normals across the surface. == Hardware rendering == Real-time applications, such as video games, usually implement per-pixel lighting through the use of pixel shaders, allowing the GPU hardware to process the effect. The scene to be rendered is first rasterized onto a number of buffers storing different types of data to be used in rendering the scene, such as depth, normal direction, and diffuse color. Then, the data is passed into a shader and used to compute the final appearance of the scene, pixel-by-pixel. Deferred shading is a per-pixel shading technique that has recently become feasible for games. With deferred shading, a "g-buffer" is used to store all terms needed to shade a final scene on the pixel level. The format of this data varies from application to application depending on the desired effect, and can include normal data, positional data, specular data, diffuse data, emissive maps and albedo, among others. Using multiple render targets, all of this data can be rendered to the g-buffer with a single pass, and a shader can calculate the final color of each pixel based on the data from the g-buffer in a final "deferred pass". Because deferred shading assumes only one visible fragment per pixel sample, transparent objects are generally handled in a separate forward pass. == Software rendering == Per-pixel lighting is also performed in software on many high-end commercial rendering applications which typically do not render at interactive framerates. This is called offline rendering or software rendering. NVidia's mental ray rendering software, which is integrated with such suites as Autodesk's Softimage is a well-known example.
Prescription monitoring program
In the United States, prescription monitoring programs (PMPs) or prescription drug monitoring programs (PDMPs) are state-run programs which collect and distribute data about the prescription and dispensation of federally controlled substances and, depending on state requirements, other potentially abusable prescription drugs. PMPs are meant to help prevent adverse drug-related events such as opioid overdoses, drug diversion, and substance abuse by decreasing the amount and/or frequency of opioid prescribing, and by identifying those patients who are obtaining prescriptions from multiple providers (i.e., "doctor shopping") or those physicians overprescribing opioids. Most US health care workers support the idea of PMPs, which intend to assist physicians, physician assistants, nurse practitioners, dentists and other prescribers, the pharmacists, chemists and support staff of dispensing establishments. The database, whose use is required by State law, typically requires prescribers and pharmacies dispensing controlled substances to register with their respective state PMPs and (for pharmacies and providers who dispense from their offices) to report the dispensation of such prescriptions to an electronic online database. The majority of PMPs are authorized to notify law enforcement agencies or licensing boards or physicians when a prescriber, or patients receiving prescriptions, exceed thresholds established by the state or prescription recipient exceeds thresholds established by the State. All states have implemented PDMPs, although evidence for the effectiveness of these programs is mixed. While prescription of opioids has decreased with PMP use, overdose deaths in many states have actually increased, with those states sharing data with neighboring jurisdictions or requiring reporting of more drugs experiencing highest increases in deaths. This may be because those declined opioid prescriptions turn to street drugs, whose potency and contaminants carry greater overdose risk. == History == Prescription drug monitoring programs, or PDMPs, are an example of one initiative proposed to alleviate effects of the opioid crisis. The programs are designed to restrict prescription drug abuse by limiting a patient's ability to obtain similar prescriptions from multiple providers (i.e. “doctor shopping”) and reducing diversion of controlled substances. This is meant to reduce risk of fatal overdose caused by high doses of opioids or interactions between opioids and benzodiazepenes, and to enable better decision making on the part of healthcare providers who may be unaware of a patient's prescription drug use, history or other prescriptions. PDMPs have been implemented in state legislations since 1939 in California, a time before electronic medical records, though implementation rose alongside increased awareness of overprescribing of opioids and overdose. A later New York state program was struck down by the U.S. Supreme Court in Whalen v. Roe. But, by 2019, 49 states, the District of Columbia, and Guam had enacted PDMP legislation. In 2021 Missouri, the last State to not use a PMP, adopted legislation to create one. PMPs are constantly being updated to increase speed of data collection, sharing of data across States, and ease of interpretation. This is being done by integrating PDMP reports with other health information technologies such as health information exchanges (HIE), electronic health record (EHR) systems, and/ or pharmacy dispensing software systems. One program that has been implemented in nine states is called the PDMP Electronic Health Records Integration and Interoperability Expansion, also known as PEHRIIE. Another software, marketed by Bamboo Health and integrated with PMPs in 43 states, uses an algorithm to track factors thought to increase risk of diversion, abuse or overdose, and assigns patients a three digit score based on presumed indicators of risk. While some studies have suggested that PDMP-HIT integration and sharing of interstate data brings benefits such as reduced opioid-related inpatient morbidity, others have found no or negative impact on mortality compared to states without PMP data sharing. Patient and media reports suggest need for testing and evaluation of algorithmic software used to score risk, with some patients reporting denial of prescriptions without c explanation or clarity of data. == Goals == Most health care workers support PMPs which intend to assist physicians, physician assistants, nurse practitioners, dentists and other prescribers, the pharmacists, chemists and support staff of dispensing establishments, as well as law-enforcement agencies. The collaboration supports the legitimate medical use of controlled substances while limiting their abuse and diversion. Pharmacies dispensing controlled substances and prescribers typically must register with their respective state PMPs and (for pharmacies and providers who dispense controlled substances from their offices) report the dispensation to an electronic online database. Some pharmacy software can submit these reports automatically to multiple states. == Usage == === List of programs by state === === Software systems === NarxCare is a prescription drug monitoring program (PDMP) run by Bamboo Health. Bamboo Health was formerly known as Appriss. It is widely used across the United States by pharmacies including Rite Aid as well as those at Walmart and Sam’s Club. The NarxCare software allows doctors to view data about a patient, combining data from the prescription registries of various U.S. states to make the registries interoperable nationally. It also uses machine learning to generate an "Overdose Risk Score" that potentially includes EMS and criminal justice data; these scores have been criticized by researchers and patient advocates for the lack of transparency in the process as well as the potential for disparate treatment of women and minority groups. Advertised as an "analytics tool and care management platform", the NarxCare software allows doctors to view data about a patient including how many pharmacies they have visited and the combinations of medication they are prescribed. It combines data from the prescription registries of various U.S. states, making the registries interoperable nationally. It additionally uses machine learning to generate various three-digit "risk scores" and an overall "Overdose Risk Score", collectively referred to as Narx Scores, in a process that potentially includes EMS and criminal justice data as well as court records. == Controversy == Many doctors and researchers support the idea of PDMPs as a tool in combatting the opioid epidemic. Opioid prescribing, opioid diversion and supply, opioid misuse, and opioid-related morbidity and mortality are common elements in data entered into PDMPs. Prescription Monitoring Programs are purported to offer economic benefits for the states who implement them by decreasing overall health care costs, lost productivity, and investigation times. However, there are many studies that conclude the impact of PDMPs is unclear. While use of PMPs has been accompanied by decrease in opioid prescribing, few analyses consider corresponding use of street opioids, extramedical use, or diversion, which might provide a more holistic method for evaluation of PMP intent and efficacy. Evidence for PDMP impact on fatal overdoses is decidedly mixed, with multiple studies finding increased overdose rates in some states, decreases in others, or no clear impact. Interestingly, an increase in heroin overdoses after PDMP implementation has been commonly reported, presumably as denial of prescription opioids sends patients in search of street drugs. Narx Scores have been criticized by researchers and patient advocates for the lack of transparency in the generation process as well as the potential for disparate treatment of women and minority groups. Writing in Duke Law Journal, Jennifer Oliva stated that "black-box algorithms" are used to generate the scores.