AhbarjietMalta

By AhbarjietMalta

Ahbarjiet Malta... more

Download on the App Store

Get it on Google Play

FAQs about AhbarjietMalta:

How many episodes does AhbarjietMalta have?

The podcast currently has 1,812 episodes available.

AhbarjietMalta episodes:

January 05, 2024 Chapter 12: Use Cases in Modern Era—Introduction
Introduction
The widespread adoption of artificial intelligence (AI) has had a significant impact on industries across the board. Even non-technical industries are now able to leverage AI to improve their products and business strategies.
1min
January 05, 2024 Points to Remember
The customer accesses ChatGPT through the website or app interface.
Overall, ChatGPT’s customer journey flow is focused on providing personalized, efficient, and high-quality customer service through advanced NLP and ML technologies.
Join Our Book's Discord Space
Join the book's Discord Workspace for Latest updates, Offers, Tech happenings around the world, New Release and Sessions with the Authors:
https://discord.bpbonline.com
1min
January 05, 2024 Chapter 11: Customer Journey in ChatGPT Free Version UI
Introduction
This chapter walks through the user journey and overall usability of the ChatGPT interface in its free version. The customer journey roughly follows the flow below:
Discovery: A customer becomes aware of ChatGPT through various channels such as social media, search engines, or word-of-mouth. The customer accesses ChatGPT through the website or app interface. The official UI URL: https://chat.openai.com/chat
↓
User processing: The user signs up, or logs in if they already have an account, inside the ChatGPT UI. Sign-up and login can also be done through a third-party email account.
Figure 11.1: Pre-signup version of ChatGPT UI
↓
Inquiry: The customer asks a question or initiates a conversation with ChatGPT by typing in a query or using a voice command. The interface to ask or tell your query:
Figure 11.2: Opening interface of ChatGPT
↓
Response: ChatGPT provides a relevant response or suggestion to the customer’s query.
↓
Feedback: The customer gives feedback on the quality of the response or asks for more clarification if needed.
↓
Resolution: ChatGPT resolves the customer’s query or provides further assistance if needed. The following example shows how ChatGPT takes a query, responds to it, and improves its answer based on feedback:
Figure 11.3: Demonstration of conversation with sequential improvisation with ChatGPT
↓
Follow-up: ChatGPT may follow-up with the customer after a certain period to ensure the resolution was satisfactory or to provide additional assistance.
↓
Retention: If the customer had a positive experience with ChatGPT, they are more likely to return and use ChatGPT’s services in the future.
↓
Overall, ChatGPT’s customer journey flow is focused on providing personalized, efficient, and high-quality customer service through advanced NLP and ML technologies.
3min
January 04, 2024 Points to Remember
API pricing (as of 2 March 2023): Although a free version of ChatGPT is still available, institutions of all sizes and individual developers need the API to integrate GPT into their own products and applications.
API pricing for ChatGPT and its close sibling, GPT-3.5, has become sustainable and affordable.
2min
January 04, 2024 Technical Limitations of ChatGPT
ChatGPT sometimes gives responses that sound plausible but are actually erroneous or illogical. Fixing this problem is difficult because: (1) there is currently no source of truth during RL training; (2) making the model more cautious makes it decline questions that it can answer correctly; and (3) supervised training misleads the model because the best response depends on the model’s knowledge rather than the demonstrator’s knowledge.
ChatGPT is sensitive to changes in the phrasing of the input and to repeated attempts at the same question. For instance, the model may claim not to know the answer when the question is phrased one way, yet answer accurately after a simple rewording.
The model is often excessively verbose and overuses certain phrases, such as restating that it is a language model developed by OpenAI. These problems are caused by biases in the training data (trainers favor lengthier responses that appear more thorough) and by well-known over-optimization problems.
When the user provides an ambiguous query, the model should ideally ask clarifying questions. Instead, our present models typically guess what the user meant.
Although we’ve worked to make the model reject unsuitable requests, there are still moments when it will respond to harmful instructions or exhibit biased behavior. We are using the Moderation API to warn about or block specific categories of unsafe content, though we anticipate some false negatives and positives for the time being. We are glad to gather user feedback to help us in our continued efforts to improve this system.
3min
January 04, 2024 Chapter 10: API Pricing Model and Technical Limitations of ChatGPT
Introduction
API pricing (as of 2 March 2023): Although a free version of ChatGPT is still available, institutions of all sizes and individual developers need the API to integrate GPT into their own products and applications. API pricing for ChatGPT and its close sibling, GPT-3.5, has become sustainable and affordable. Here is the current pricing per 1,000 tokens (tokens are pieces of words; 1,000 tokens correspond to roughly 750 words, about the length of a short essay):
Model                      Price / 1K tokens
gpt-3.5-turbo              $0.0020
Ada (fastest)              $0.0004
Babbage                    $0.0005
Curie                      $0.0020
Davinci (most powerful)    $0.0200
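The per-token arithmetic behind this price list can be sketched in a few lines of Python. This is only an illustration, assuming the per-1K-token prices quoted above for gpt-3.5-turbo, Ada, Babbage, and Curie and the rough ratio of 750 words per 1,000 tokens; the names `PRICE_PER_1K`, `estimate_tokens`, and `estimate_cost` are hypothetical and not part of any OpenAI library.

```python
# Prices per 1,000 tokens, copied from the list above (as of 2 March 2023).
PRICE_PER_1K = {
    "gpt-3.5-turbo": 0.002,
    "ada": 0.0004,
    "babbage": 0.0005,
    "curie": 0.002,
}

# Rough rule of thumb from the text: 1,000 tokens is about 750 words.
WORDS_PER_1K_TOKENS = 750


def estimate_tokens(word_count: int) -> int:
    """Approximate the number of tokens for a given number of English words."""
    return round(word_count * 1000 / WORDS_PER_1K_TOKENS)


def estimate_cost(model: str, tokens: int) -> float:
    """Return the estimated cost in US dollars for `tokens` tokens."""
    return PRICE_PER_1K[model] * tokens / 1000


if __name__ == "__main__":
    tokens = estimate_tokens(750)
    for model in PRICE_PER_1K:
        cost = estimate_cost(model, tokens)
        print(f"{model}: {tokens} tokens cost about ${cost:.4f}")
```

Actual token counts depend on the tokenizer and the text, so the word-based estimate is only a planning figure.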
2min
January 03, 2024 Points to Remember
GPT-1 was launched in June 2018. It was trained on a diverse corpus of unlabeled text to build a strong natural language understanding base through generative pre-training followed by fine-tuning.
The study showed how pre-training improved the model’s zero-shot performance on a variety of NLP tasks, including sentiment analysis, question answering, and Winograd schema resolution.
GPT-1 performed better than specifically trained supervised state-of-the-art models in 9 out of the 12 tasks the models were compared on.
GPT-1 improved significantly on the prior best results, with gains of up to 8.9% on Story Cloze and 5.7% overall on RACE.
The next version, GPT-2, was introduced in 2019. It was trained on a larger dataset and had more parameters than GPT-1.
The foundation for zero-shot task transfer, as described for GPT-2, is task conditioning.
GPT-2’s capacity for zero-shot task transfer is intriguing.
Zero-shot learning is a special case of zero-shot task transfer in which no examples are given at all and the model is simply instructed to perform the task.
3min
January 03, 2024 Introduction of ChatGPT
After redefining and expanding the structure of its existing models across various NLP tasks, OpenAI built on its GPT-3.5 series to create ChatGPT (referred to as a sibling model of GPT-3.5), a conversational AI system that can handle complex NLP tasks. Over time, GPT-3.5 received several feature and optimization updates. OpenAI introduced a set of GPT-3.5 model versions, which makes it clearer for users which model to use and experiment with for their use cases.
Turbo: Turbo is the same model family that powers ChatGPT. It is optimized for conversational chat input and output, yet performs about as well on completions as the Davinci model family. Any use case that works well in ChatGPT should work well with the Turbo model family in the API. The Turbo family is also the first to receive regular model updates, as ChatGPT does.
Qualities: Conversation and text generation
Maximum request size: 4,096 tokens
Training data: up to Sep 2021
Davinci: The Davinci model family is the most capable and can perform any task the other models (Ada, Curie, and Babbage) can, often with less instruction. Davinci yields the best results for tasks requiring a deep grasp of the text, such as summarizing for a particular audience and creating original content. These expanded capabilities require more computational resources, so Davinci costs more per API request and is slower than the other models.
Understanding the purpose of text is another area where Davinci excels. Davinci excels at deducing solutions to various logical conundrums and illuminating character motivations. Some of the most difficult cause-and-effect AI puzzles have been cracked by Davinci.
Qualities: Complex intent, cause and effect, summarization for audience
Maximum request size: 4,000 tokens
Training data: up to June 2021
Curie: Curie is incredibly strong yet moves very quickly. While Curie excels at many complex tasks like sentiment classification and summarization, Davinci is better at processing complex text. Being a general-purpose chatbot, Curie is also fairly adept at doing Q&A and answering queries.
Qualities: Language translation, complex classification, text sentiment, summarization
Babbage: Babbage is capable of simple categorization and other elementary tasks. When assessing how well documents match search queries using semantic search, it is also extremely capable.
Qualities: Moderate classification, semantic search classification
Ada: Ada is usually the fastest model and is capable of tasks that don’t call for much nuance, such as text parsing, address correction, and some types of categorization. Ada’s performance can often be improved by providing more context.
Qualities: Parsing text, simple classification, address correction, keywords
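The maximum request sizes listed for Turbo and Davinci matter in practice because a long prompt leaves less room for the reply. The sketch below is illustrative only: it assumes the limit covers the prompt and the generated completion together, and the `MAX_REQUEST_TOKENS` table and `completion_budget` helper are hypothetical names, not part of the OpenAI API.

```python
# Maximum request sizes quoted in the model list above.
MAX_REQUEST_TOKENS = {"turbo": 4096, "davinci": 4000}


def completion_budget(model: str, prompt_tokens: int) -> int:
    """Return how many tokens remain for the completion.

    Assumption: the model's limit applies to prompt + completion combined.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    limit = MAX_REQUEST_TOKENS[model]
    # A prompt at or over the limit leaves no room for a completion.
    return max(limit - prompt_tokens, 0)


if __name__ == "__main__":
    print(completion_budget("turbo", 1000))
    print(completion_budget("davinci", 5000))
```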
ChatGPT has been designed to comply with many human values and rules. It was trained up to early 2022. The basic version of ChatGPT uses the gpt-3.5-turbo model as its backend, which is far cheaper than many other GPT-3.5 series models and therefore more affordable for users.
Timeline Summary
Date           Milestone
11 Jun 2018    GPT-1 announced on the OpenAI blog.
14 Feb 2019    GPT-2 announced on the OpenAI blog.
28 May 2020    Initial GPT-3 preprint paper published to arXiv.
11 Jun 2020    GPT-3 API private beta.
22 Sep 2020    GPT-3 licensed to Microsoft.
18 Nov 2021    GPT-3 API opened to the public.
27 Jan 2022    InstructGPT was released as text-davinci-002, now known as GPT-3.5. InstructGPT preprint paper Mar/2022.
28 Jul 2022    Exploring data-optimal models with FIM, paper on arXiv.
1 Sep 2022     GPT-3 model pricing was cut by 66% for the davinci and curie models.
21 Sep 2022    Whisper (speech recognition) announced on the OpenAI blog.
28 Nov 2022    GPT-3.5 expanded to text-davinci-003, announced via email:
               1. Higher quality writing.
               2. Handles more complex instructions.
               3. Better at longer form content generation.
30 Nov 2022    ChatGPT announced on the OpenAI blog.
1 Feb 2023     ChatGPT hits 100 million monthly active unique users (via UBS report).
1 Mar 2023     ChatGPT API announced on the OpenAI blog.
*The timeline was extracted from the GPT blog by Dr Alan D. Thompson
8min
January 03, 2024 Introduction of GPT - 3.5, InstructGPT
A major issue with large language models was that their unfiltered output could be untruthful, toxic, or irrelevant to the user. To address this, OpenAI fine-tuned its models with human feedback so that they could handle a wide range of tasks as intended. The resulting models, trained with supervised fine-tuning followed by reinforcement learning from human feedback (RLHF), are referred to as InstructGPT.
Base Framework
In InstructGPT, labelers provide demonstrations of the intended behavior on the input prompt distribution. The prompts cover tasks such as generation, question answering, dialogue, summarization, extraction, and other natural language tasks, and are overwhelmingly in English (96%). About 40 contractors provided the human feedback, and training labelers agreed with each other roughly 73% of the time.
Model Specifications
To build the training data for InstructGPT, labelers were asked to write three kinds of prompts: (1) plain prompts for an arbitrary task; (2) few-shot prompts consisting of an instruction with multiple query/response pairs; and (3) user-based prompts corresponding to use cases described by waitlisted users. Training is split into three stages with separate datasets: the SFT models are trained on labeler demonstrations; the reward model is trained on labelers’ rankings of earlier model outputs; and the PPO models are fine-tuned on prompts without any human labels.
Supervised fine-tuning (SFT): The model is fine-tuned on the labeler demonstrations for 16 epochs, using cosine learning rate decay and a residual dropout of 0.2.
Reward modeling (RM): The model is trained to take in a prompt and a response and to output a scalar reward. The difference in rewards between two responses represents the log odds that a human labeler will prefer one over the other. The authors used 6B-parameter RMs rather than 175B-parameter ones.
Reinforcement learning (RL): A bandit-style environment presents a random customer prompt and expects a response. Given the prompt and the response, it produces a reward determined by the reward model and ends the episode. To keep the reward model from being over-optimized, a per-token KL penalty from the SFT model is applied at each token. The value function is initialized from the RM. These models are called “PPO.”
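Two of the quantities in these training stages can be written down compactly. The sketch below is a simplified illustration rather than the authors' code: it treats the reward difference as log odds (so the preference probability is a logistic sigmoid), and it subtracts a KL-style penalty computed from per-token log-probabilities. The function names are hypothetical, and the default `beta` is an illustrative coefficient chosen for this example.

```python
import math


def preference_probability(reward_a: float, reward_b: float) -> float:
    """Probability that a labeler prefers response A over response B.

    The reward difference is interpreted as log odds, so the probability
    is the logistic sigmoid of (reward_a - reward_b).
    """
    return 1.0 / (1.0 + math.exp(-(reward_a - reward_b)))


def pairwise_loss(reward_preferred: float, reward_rejected: float) -> float:
    """Negative log-likelihood of the labeler's choice for one comparison."""
    return -math.log(preference_probability(reward_preferred, reward_rejected))


def kl_penalized_reward(rm_score, logp_policy, logp_sft, beta=0.02):
    """Reward-model score minus a per-token KL penalty against the SFT model.

    logp_policy and logp_sft hold the per-token log-probabilities of the
    sampled response under the RL policy and the SFT model respectively.
    beta is an illustrative penalty coefficient (an assumption here).
    """
    penalty = sum(p - q for p, q in zip(logp_policy, logp_sft))
    return rm_score - beta * penalty


if __name__ == "__main__":
    print(preference_probability(2.0, 0.5))
    print(pairwise_loss(2.0, 0.5))
    print(kl_penalized_reward(1.0, [-1.0, -2.0], [-1.5, -2.5], beta=0.1))
```

When the policy assigns a higher probability to its sampled tokens than the SFT model does, the penalty is positive and the effective reward shrinks, which discourages the policy from drifting far from the SFT model.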
Results
Continuing to extend its ecosystem of NLP models, OpenAI also tackled the problem of infilling, known as fill-in-the-middle (FIM). The goal is to give models strong text-infilling ability without compromising their ability to generate text and code normally from left to right. The team’s method for transforming the training data is remarkably simple: move a random span of text from the middle of a document to its end.
The team shows that a causal autoregressive (AR) LLM can learn to fill in the middle of a document, and handle related tasks such as inferring import modules, writing docstrings, and finishing functions, when it is jointly trained on a mixture of FIM-transformed data and traditional left-to-right data across multiple objectives and datasets. Overall, FIM models retain the same left-to-right capability as standard AR models while also learning to fill in the middle, so the proposed data transformation provides FIM essentially for free.
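The data transformation described here can be sketched at the character level. This is a minimal illustration under stated assumptions, not the paper's implementation: the sentinel strings `<PRE>`, `<SUF>`, and `<MID>` are hypothetical placeholders (real systems use dedicated special tokens), and `fim_restore` assumes the document itself does not contain those strings.

```python
import random

# Hypothetical sentinel strings marking the three parts of the document.
PRE, SUF, MID = "<PRE>", "<SUF>", "<MID>"


def fim_transform(document: str, rng: random.Random) -> str:
    """Move a random middle span of `document` to the end.

    The document is cut at two random positions into prefix, middle, and
    suffix, then emitted as prefix, suffix, middle, so that a left-to-right
    model learns to generate the middle after seeing both sides.
    """
    i, j = sorted(rng.randrange(len(document) + 1) for _ in range(2))
    prefix, middle, suffix = document[:i], document[i:j], document[j:]
    return f"{PRE}{prefix}{SUF}{suffix}{MID}{middle}"


def fim_restore(transformed: str) -> str:
    """Invert fim_transform and recover the original document."""
    rest = transformed[len(PRE):]
    prefix, rest = rest.split(SUF, 1)
    suffix, middle = rest.split(MID, 1)
    return prefix + middle + suffix


if __name__ == "__main__":
    doc = "def add(a, b):\n    return a + b\n"
    transformed = fim_transform(doc, random.Random(0))
    print(repr(transformed))
    print(fim_restore(transformed) == doc)
```

In joint training, only a fraction of documents would be passed through such a transformation, with the rest left in ordinary left-to-right order.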
At 175B parameters (the davinci models, the most recent update), InstructGPT outputs are preferred by human labelers over GPT-3 outputs more than 85% of the time, and over few-shot prompted GPT-3 outputs 71% of the time. This means that almost 3 out of 4 times, labelers prefer InstructGPT over a GPT-3 that has been conditioned to do well on the task at hand. Not even prompt engineering is enough to beat InstructGPT.
Figure 9.7: Evaluation of the final snapshots of models pretrained for 100B tokens without FIM and then fine-tuned for 25B (row a) and 50B (row b) tokens with FIM
[Source: InstructGPT paper]
To learn more about the technical aspects of GPT-3.5, refer to “Training language models to follow instructions with human feedback”: https://tinyurl.com/yny5uux2
Cost Reduction in GPT -3 Model API Tokens
Over time, OpenAI also reduced its API prices for the GPT-3 series, most notably for the Davinci and Curie models. Both were cut by 66%: Davinci from $0.06 to $0.02 per 1K tokens, and Curie from $0.006 to $0.002 per 1K tokens. The reduction followed continued progress by the OpenAI team in making the models more efficient and more sustainable to run.
Introduction of Whisper
As part of building a broader ecosystem of NLP models, OpenAI introduced Whisper, an automatic speech recognition (ASR) system trained on 680,000 hours of multilingual and multitask supervised data collected from the web. The model is designed to cope with background noise and other disturbances in the audio, bringing its output closer to what was actually said. It also handles a set of multilingual tasks and produces transcripts. The multilingual part of the training data covers 98 languages.
Overview of Whisper
The training dataset consists of diverse audio clips weighted towards real-life recordings, so that the model better reflects how people actually speak. Whisper splits the input audio into 30-second chunks, converts each chunk into a log-Mel spectrogram, and passes it to an encoder-decoder Transformer. The decoder is trained to predict the corresponding text caption, intermixed with special tokens that direct the single model to perform tasks such as language identification, phrase-level timestamps, multilingual speech transcription, and to-English speech translation. Whisper comes in 9 model variants that differ in size and capability.
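The first step of that pipeline, cutting audio into fixed 30-second chunks, can be sketched in plain Python. This is an illustration only: the 16 kHz sample rate and the zero-padding of the last chunk are assumptions made for this sketch, `split_into_chunks` is a hypothetical helper, and the spectrogram computation itself is omitted.

```python
SAMPLE_RATE = 16_000   # samples per second (assumed for this sketch)
CHUNK_SECONDS = 30     # chunk length stated in the text


def split_into_chunks(samples, sample_rate=SAMPLE_RATE, seconds=CHUNK_SECONDS):
    """Split a sequence of audio samples into fixed-length chunks.

    The final chunk is zero-padded so that every chunk has the same length.
    """
    chunk_len = sample_rate * seconds
    chunks = []
    for start in range(0, len(samples), chunk_len):
        chunk = list(samples[start:start + chunk_len])
        # Pad the last, shorter chunk with silence.
        chunk.extend([0.0] * (chunk_len - len(chunk)))
        chunks.append(chunk)
    return chunks


if __name__ == "__main__":
    # 70 seconds of dummy audio at 1 sample per second, for readability.
    chunks = split_into_chunks([1.0] * 70, sample_rate=1)
    print(len(chunks), [len(c) for c in chunks])
```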
Figure 9.8: The process of text processing through the training pipeline
[Source: Whisper paper]
Other current methods usually use either larger but unsupervised audio pre-training datasets or smaller, more closely paired audio-text training datasets. Because Whisper was trained on a broad and varied dataset rather than being tailored to any particular one, it does not outperform models that specialize in LibriSpeech, a very competitive speech recognition benchmark.
Figure 9.9: The encoder-decoder model of Whisper
[Source: Whisper paper]
However, when its zero-shot performance is compared across a wide range of different datasets, it is far more robust and makes 50% fewer errors than those models.
Whisper’s performance is close to that of professional human transcribers. The comparison used the word error rate (WER) distributions of 25 recordings from the Kincaid46 dataset, transcribed by Whisper, 4 commercial ASR systems, one computer-assisted human transcription service, and 4 human transcription services; the error ranges were broadly similar for all of them.
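The WER figures used in this comparison follow the standard definition: the word-level edit distance between the reference transcript and the system output, divided by the number of reference words. The sketch below is a minimal version of that calculation; `word_error_rate` is a hypothetical helper, and it skips the text normalization (casing, punctuation) that a real evaluation would normally apply first.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.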
To learn more about the technical aspects of Whisper, refer to “Robust Speech Recognition via Large-Scale Weak Supervision”: https://tinyurl.com/359y5t5y
Figure 9.10: The box plot is superimposed with dots indicating the WERs on individual recordings, and the aggregate WER over the 25 recordings are annotated on each box
[Source: Whisper paper]
11min
January 03, 2024 The Introduction to GPT-3
Launch date: 28th May, 2020
About a year after the launch of GPT-2, OpenAI released the next, more advanced version of the GPT series, GPT-3, in the paper “Language Models are Few-Shot Learners”. OpenAI built GPT-3 with 175 billion parameters in its effort to create robust and powerful language models that need little task-specific training and only a few demonstrations to understand tasks and carry them out. The model has about 100 times more parameters than GPT-2 and ten times more than Microsoft’s Turing NLG language model. GPT-3 performs well on downstream NLP tasks in zero-shot and few-shot settings because of its large number of parameters and the sizable dataset it was trained on. Thanks to its capacity, it can write articles that are difficult to distinguish from ones written by people. It can also perform on-the-fly tasks that it was never explicitly trained on, such as adding and subtracting numbers, generating SQL queries and code, unscrambling the words in a sentence, and writing React and JavaScript code from a natural-language task description.
Base Framework
Large language models acquire pattern recognition and other abilities from the text they are trained on. While learning the core objective of predicting the next word from its context, a language model begins to recognize patterns in the data, which helps it reduce the language modeling loss. Later, this skill helps the model during zero-shot task transfer: when given a few examples and/or a description of what needs to be done, it matches the pattern of the examples against what it learned from similar data and uses that knowledge to carry out the task. This capability of large language models grows stronger as the parameter count rises.
As stated previously, the few-shot, one-shot, and zero-shot settings are special cases of zero-shot task transfer. In the few-shot setting, the model is given the task description and as many examples as fit into its context window. In the one-shot setting it is given exactly one example, and in the zero-shot setting none. The model’s few-shot, one-shot, and zero-shot capabilities all improve with increased capacity.
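The difference between the three settings comes down to how many worked examples are placed in the prompt. The sketch below shows one way to assemble such prompts; the `build_prompt` helper and its "Input:/Output:" layout are hypothetical choices for illustration, not a format prescribed by the GPT-3 paper or the API.

```python
def build_prompt(task_description, examples, query):
    """Assemble an in-context learning prompt.

    `examples` is a list of (input, output) pairs: an empty list gives a
    zero-shot prompt, one pair a one-shot prompt, several a few-shot prompt.
    """
    parts = [task_description]
    for source, target in examples:
        parts.append(f"Input: {source}\nOutput: {target}")
    # The final entry leaves the output blank for the model to complete.
    parts.append(f"Input: {query}\nOutput:")
    return "\n\n".join(parts)


if __name__ == "__main__":
    task = "Translate English to French."
    demos = [("sea otter", "loutre de mer"), ("peppermint", "menthe poivrée")]
    print(build_prompt(task, [], "cheese"))          # zero-shot
    print("---")
    print(build_prompt(task, demos[:1], "cheese"))   # one-shot
    print("---")
    print(build_prompt(task, demos, "cheese"))       # few-shot
```

No model weights change in any of these settings; the examples only condition the model through its context window.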
Figure 9.4: Image representing the context learning mechanism during training
[Source: GPT -3 paper]
GPT-3 was trained on five distinct corpora, each assigned a specific weight. Higher-quality datasets were sampled more often, so the model was trained on them for more than one epoch. The five datasets were Common Crawl, WebText2, Books1, Books2, and Wikipedia, which together cover a wide range of textual and contextual patterns.
Model Specifications
Like GPT-2, GPT-3 uses the Transformer-based architecture of the first GPT model, with a few major differences:
GPT-3 was evaluated in three in-context learning settings (zero-shot, one-shot, and few-shot) instead of with traditional fine-tuning.
GPT-3 has 96 layers, each with 96 attention heads.
The size of the word embeddings was increased from 1600 for GPT-2 to 12288 for GPT-3.
Context window size was increased from 1024 for GPT-2 to 2048 tokens for GPT-3.
Adam optimiser was used with β_1=0.9, β_2=0.95 and ε= 10^(-8).
Alternating dense and locally banded sparse attention patterns were used.
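To make the optimizer settings in this list concrete, the sketch below performs a single Adam update on one scalar parameter. It is a textbook version of the update rule, not OpenAI's training code: the β_1, β_2, and ε defaults follow the values listed above, while the learning rate `lr` and the `adam_step` name are hypothetical choices for illustration.

```python
import math


def adam_step(param, grad, m, v, t, lr=1e-4, beta1=0.9, beta2=0.95, eps=1e-8):
    """One Adam update for a single scalar parameter.

    m and v are the running first- and second-moment estimates, and t is
    the 1-based step count used for bias correction.
    """
    m = beta1 * m + (1 - beta1) * grad          # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad * grad   # second-moment estimate
    m_hat = m / (1 - beta1 ** t)                # bias-corrected moments
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v


if __name__ == "__main__":
    param, m, v = 1.0, 0.0, 0.0
    for step in range(1, 4):
        param, m, v = adam_step(param, grad=0.5, m=m, v=v, t=step, lr=0.1)
        print(step, param)
```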
Evaluation
A variety of language modeling and NLP datasets were used to test GPT-3. In few-shot or zero-shot settings, GPT-3 outperformed state-of-the-art methods on language modeling datasets such as LAMBADA and Penn Treebank. On other datasets it could not surpass the state of the art, but it did improve on the zero-shot state of the art. On NLP tasks such as closed-book question answering, Winograd schema resolution, and translation, GPT-3 again performed well, frequently outperforming or coming close to fine-tuned models.
Figure 9.5: Four methods for performing a task with a language model
[Source: GPT -3 paper]
The model performed better in few-shot settings than in one- and zero-shot settings for the majority of the tasks. On the CoQA benchmark, it scored 81.5 F1 in the zero-shot setting, 84.0 F1 in the one-shot setting, and 85.0 F1 in the few-shot setting, compared to the 90.7 F1 achieved by the fine-tuned SOTA. On the TriviaQA benchmark, it reached 64.3%, 68.0%, and 71.2% accuracy in the zero-shot, one-shot, and few-shot settings respectively, outperforming the state of the art (68%) by 3.2 percentage points. On the LAMBADA dataset, it reached 76.2%, 72.5%, and 86.4% accuracy in the zero-shot, one-shot, and few-shot settings respectively, outperforming the state of the art (68%) by 18 percentage points. In addition to traditional NLP tasks, the model was also evaluated on more synthetic tasks, such as adding numbers, unscrambling words, writing news articles, and learning and using new terms. For these tasks as well, the model performed better in the few-shot setting than in the one-shot and zero-shot settings, with performance increasing with the number of parameters.
To learn more about the technical aspects of GPT-3, refer to “Language Models are Few-Shot Learners”: https://tinyurl.com/4ym9tehp
API Development of GPT - 3
In June 2020, OpenAI released its API, which offers a general-purpose “text in, text out” interface. Unlike most AI systems, which are developed for a single use case, it lets users try it on essentially any English-language task. Developers can request access to integrate the API into a product, create an entirely new application, or help research the strengths and limitations of the technology.
Given any text prompt, the API returns a text completion, attempting to match the pattern you gave it. You can “program” it by showing it a few examples of what you want it to do; its success generally varies with the difficulty of the task. The API also lets you improve performance on specific tasks, either by training on a dataset (small or large) of examples you supply or by learning from human feedback provided by users or labelers.
In September 2020, Microsoft exclusively licensed GPT-3, allowing it to leverage GPT-3’s technical innovations to develop and deliver advanced AI solutions for its customers.
Figure 9.6: The process of input feeding in InstructGPT model or GPT 3.5
[Source: InstructGPT Paper]
At the end of 2021, OpenAI made GPT-3 and its API publicly available to users in specified countries, along with an improved Playground that makes it easy to prototype with the models, an example library with dozens of prompts to get developers started, and Codex, a new model that translates natural language into code.
11min
