Wednesday, May 1, 2024

Software Development Life Cycle

The Software Development Life Cycle (SDLC) refers to a methodology with clearly defined processes for creating high-quality software. In detail, the SDLC methodology focuses on the following phases of software development:

  • Requirement analysis

  • Planning

  • Software design such as architectural design

  • Software development

  • Testing

  • Deployment

This article will explain how the SDLC works, dive deeper into each of the phases, and provide examples to help you better understand each one.

What is the software development life cycle?

SDLC, or the Software Development Life Cycle, is a process that produces software of the highest quality at the lowest cost in the shortest possible time. SDLC provides a well-structured flow of phases that helps an organization quickly produce high-quality software that is well-tested and ready for production use.

The SDLC involves six phases as explained in the introduction. Popular SDLC models include the waterfall model, spiral model, and Agile model.

How the SDLC Works

SDLC works by lowering the cost of software development while simultaneously improving quality and shortening production time. SDLC achieves these apparently divergent goals by following a plan that removes the typical pitfalls of software development projects. That plan starts by evaluating existing systems for deficiencies.

Next, it defines the requirements of the new system. It then creates the software through the stages of analysis, planning, design, development, testing, and deployment. By anticipating costly mistakes like failing to ask the end user or client for feedback, the SDLC can eliminate redundant rework and after-the-fact fixes.

It’s also important to know that there is a strong focus on the testing phase. As the SDLC is a repetitive methodology, you have to ensure code quality at every cycle. Many organizations tend to spend little effort on testing, even though a stronger focus on testing can save them a lot of rework, time, and money. Be smart and write the right types of tests.

Stages and Best Practices

Following the best practices and/or stages of SDLC ensures the process works in a smooth, efficient, and productive way.

1. Identify the Current Problems 

“What are the current problems?” This stage of the SDLC means getting input from all stakeholders, including customers, salespeople, industry experts, and programmers. Learn the strengths and weaknesses of the current system with improvement as the goal.

2. Plan

“What do we want?” In this stage of the SDLC, the team determines the cost and resources required for implementing the analyzed requirements. It also details the risks involved and provides sub-plans for mitigating those risks.

In other words, the team should determine the feasibility of the project and how they can implement the project successfully with the lowest risk in mind.

3. Design

“How will we get what we want?” This phase of the SDLC starts by turning the software specifications into a design plan called the Design Specification. All stakeholders then review this plan and offer feedback and suggestions. It’s crucial to have a plan for collecting and incorporating stakeholder input into this document. Failure at this stage will almost certainly result in cost overruns at best and the total collapse of the project at worst.

4. Build

“Let’s create what we want.”

At this stage, the actual development starts. It’s important that every developer sticks to the agreed blueprint. Also, make sure you have proper guidelines in place about the code style and practices.

For example, define a nomenclature for files or define a variable naming style such as camelCase. This will help your team produce organized and consistent code that is easier to understand and also easier to test during the next phase.
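As a sketch, a team’s style guide might be captured like this (a hypothetical Python example; names like OrderProcessor are invented for illustration — Python’s own convention is snake_case, while a JavaScript team might standardize on camelCase):

```python
# Hypothetical style-guide rules, applied consistently across the codebase:
#   - constants: UPPER_SNAKE_CASE
#   - classes: PascalCase
#   - functions and variables: snake_case
MAX_RETRIES = 3

class OrderProcessor:
    def process_order(self, order_id):
        # Consistent naming makes the code easier to read and test later.
        confirmation_code = f"ORD-{order_id}"
        return confirmation_code
```

Which convention you pick matters less than applying it everywhere.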

5. Test the Code

“Did we get what we want?” In this stage, we test for defects and deficiencies. We fix those issues until the product meets the original specifications.

In short, we want to verify if the code meets the defined requirements.
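As a minimal sketch of that idea (the discount rule and function names are hypothetical), a test verifies that the code meets a defined requirement:

```python
# Hypothetical requirement: orders over $100 get a 10% discount.
def apply_discount(total):
    """Return the payable amount after any discount."""
    return round(total * 0.9, 2) if total > 100 else total

# Tests that check the code against the requirement:
def test_discount_applied():
    assert apply_discount(200) == 180.0

def test_no_discount_under_threshold():
    assert apply_discount(50) == 50
```

If a test fails, the defect is fixed and the tests are run again until the product meets the specification.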

6. Software Deployment

“Let’s start using what we got.”

At this stage, the goal is to deploy the software to the production environment so users can start using the product. However, many organizations choose to move the product through different deployment environments such as a testing or staging environment.

This allows stakeholders to safely try out the product before releasing it to the market. In addition, it allows any final mistakes to be caught before the product is released.

The most common SDLC examples or SDLC models are listed below.

Waterfall Model

This SDLC model is the oldest and most straightforward. With this methodology, we finish one phase and then start the next. Each phase has its own mini-plan and each phase “waterfalls” into the next. The biggest drawback of this model is that small details left incomplete can hold up the entire process.

Agile Model

The Agile SDLC model separates the product into cycles and delivers a working product very quickly. This methodology produces a succession of releases. Testing of each release feeds back info that’s incorporated into the next version. According to Robert Half, the drawback of this model is that the heavy emphasis on customer interaction can lead the project in the wrong direction in some cases.

Iterative Model

This SDLC model emphasizes repetition. Developers create a version very quickly and for relatively little cost, then test and improve it through rapid and successive versions. One big disadvantage here is that it can eat up resources fast if left unchecked.

V-Shaped Model

An extension of the waterfall model, this SDLC methodology tests at each stage of development. As with waterfall, this process can run into roadblocks.

Big Bang Model

This high-risk SDLC model throws most of its resources at development and works best for small projects. It lacks the thorough requirements definition stage of the other methods.

Spiral Model

The most flexible of the SDLC models, the spiral model is similar to the iterative model in its emphasis on repetition. The spiral model goes through the planning, design, build and test phases over and over, with gradual improvements at each pass.

Benefits of the SDLC

SDLC done right can allow the highest level of management control and documentation. Developers understand what they should build and why. All parties agree on the goal upfront and see a clear plan for arriving at that goal. Everyone understands the costs and resources required.

Several pitfalls can turn an SDLC implementation into more of a roadblock to development than a tool that helps us. Failure to take into account the needs of customers and all users and stakeholders can result in a poor understanding of the system requirements at the outset. The benefits of SDLC only exist if the plan is followed faithfully.

Computer Careers


It’s safe to say that now is a great time to pursue a career in technology. In fact, the Bureau of Labor Statistics (BLS) projects employment in computer and information technology careers to increase much faster than average. Let us introduce you to some of the many computer jobs worth considering.

1. Big data engineer

Big Data Engineers spend their work days communicating with business users and data scientists with the goal of translating business objectives into workable data-processing workflows. These positions require a robust knowledge of statistics, experience with programming and the ability to design and implement working solutions for common big data challenges.

2. Applications architect

Digital pros who land a position like applications architect are required to maintain a high level of technical expertise while also excelling in areas like planning, coordination, communication and teamwork. These professionals are tasked with designing major aspects of the architecture of an application, providing technical leadership to the development team, performing design and code reviews, ensuring enterprise-wide application standards are met and more.

3. Web developer

A web developer is the person responsible for the building and maintenance of a website. Because a website is often an organization’s primary public-facing property, it’s important for web developers to understand business needs and how to build sites that accommodate them. This position requires an in-depth knowledge of internet protocols and development languages like PHP, JavaScript, HTML5 and CSS as well.

4. Database administrator

Put simply, database administrators use specialized software to securely store and organize data, ensuring that data is both available to users and secure from unauthorized access. Additional duties include identifying user needs to create and administer databases, testing and making modifications to database structure as needed and merging old databases into new ones. Because many databases contain personal and financial information, database administrators make security a top priority.

5. Computer hardware engineer

Computer Hardware Engineers are tasked with designing, developing and supervising the production of computer hardware like keyboards, modems, printers, computer systems, chips and circuit boards. Their duties are similar to those of electronics engineers, although they focus exclusively on computer technology, with tasks like designing blueprints of new equipment, making models of new hardware designs, upgrading current computer equipment and supervising the manufacturing of new hardware.

6. Computer software engineer

Conversely, computer software engineers focus their work on designing and developing software used to control computers by utilizing the principles of computer science and mathematical analysis. You’ll typically see this work materialize through the design and development of computer games, word processors, operating systems and compilers.

7. Data security analyst

Data Security Analysts use their thorough understanding of computer and network security, including aspects like firewall administration, encryption technologies and network protocols, to perform security audits and risk assessments on behalf of their organizations, make recommendations for enhancing system security, research attempted breaches of data security, rectify security weaknesses and formulate security policies and procedures. They’re also expected to stay up-to-date with industry security trends and relevant government regulations.

8. Information systems security manager

Quality information system security managers provide leadership, guidance and training to information systems security personnel. This requires a strong technical background in tandem with excellent interpersonal and management skills. Typical duties include reviewing, implementing, updating and documenting information security policies and procedures within an organization, in addition to ensuring that legal and contractual security and privacy mandates are adhered to.

9. Health information technology careers

While jobs in technology continue to rise in prevalence, so too do jobs in the healthcare field. Health information technology (HIT) is an excellent way to blend the two, launching your career in the midst of rapid growth. The HIT field is a specialized subset of information technology professionals who work for medical facilities and other healthcare organizations to increase the efficiency and quality of clinical care through technology. You can expect to encounter tech positions in healthcare environments that center on elements like electronic billing and coding systems, electronic medical records and networks for digital imaging.

10. Statistician

Quality statisticians are needed in a litany of different industries, including economics, government, business, biology, engineering, politics, public health, medicine, psychology, marketing, education and even sports. They collect, analyze and present data using their knowledge of statistics to conduct surveys, opinion polls and more. Statisticians use their technical skills to determine how to best collect information, what groups to test, what questions to ask and how to interpret and publish their findings.

11. Mathematician

Mathematicians use mathematical theory, algorithms and computational methods to answer questions relating to everything from economics, business and engineering to physics and other sciences. They are often tasked with both utilizing existing mathematical theories and developing new ones to connect previously unknown relationships between mathematical concepts. Our digital landscape’s influx of technical capabilities has catapulted mathematicians’ abilities to develop data-driven solutions for real-world problems.

12. Business intelligence analyst

Business intelligence analysts are responsible for designing and developing data analysis and reporting solutions, communicating analysis results while making recommendations to senior management teams and developing data cleansing regulations for their organizations. In order to land this computer career, you’ll need to have a strong background in database technology, analytical reporting and languages like SQL and Python, while also demonstrating excellent written and oral communication skills.

13. Computer and information research scientist

These highly skilled professionals spend their work days inventing and designing new approaches to computing technology while also discovering innovative uses for technologies that already exist. Computer and information research scientists can work in business, science, medicine and other fields to study and solve complex problems, creating and improving computer software and hardware.

14. Network architect

Network architects use their backgrounds in networking technology to assess business and application requirements for phone, data and internet systems for an organization. This includes planning, designing and upgrading network installation projects, troubleshooting network architecture, making recommendations for system enhancements and maintaining backup, version-control and defense systems. Network architects help business leaders make informed decisions on network investments that fit their short and long-term needs.

15. Systems engineer

This experienced tech position requires the ability to communicate complex information to technical and nontechnical users, relying on an in-depth knowledge of the technology in use, as well as advanced analytical, troubleshooting and design skills. Systems Engineers are charged with developing, maintaining and supporting technical infrastructure, hardware and system software components, while also providing user support across multiple infrastructure platforms.

16. Computer support specialist

The important work of computer support specialists is pretty accurately explained by the title itself. They provide help and advice to computer users and organizations, offering technical assistance directly to users. This includes regular testing, troubleshooting and overall maintenance of existing network systems.

17. Mobile application developer

Mobile Application Developers specialize in coding, testing, debugging and monitoring mobile apps. They use their strong analytical and programming skills to contribute to the development of ongoing projects, recommending changes and enhancements to software applications as needed. Most mobile application developer positions will require previous experience building mobile applications across a number of different platforms, in addition to knowledge of common mobile development languages.

Monday, April 29, 2024

Common Computer, IT, and Technology Abbreviations


There are literally thousands of computer abbreviations out there. Many are concerned with the technical aspects of the computer, while others deal with personal communication. Following are some of the more common ones that you may have heard but may not know exactly what they mean.

Common Computer Abbreviations

Operating Systems and Data Storage

Two of the most basic components of any computer system are its operating system and data storage. Many acronyms reflect these basic requirements.

  • AFA - This acronym stands for All Flash Array, a grouping of flash memory devices that helps boost performance.

  • BIOS - This is the Basic Input Output System, which controls the computer, telling it what operations to perform. These instructions are on a chip that connects to the motherboard.

  • BYTE - A byte is a storage unit for data. KB is a kilobyte (1,024 bytes); MB is a megabyte (1,024 KB, roughly one million bytes); and GB is a gigabyte (1,024 MB, roughly one billion bytes).

  • CPU - This stands for the Central Processing Unit of the computer. This is like the computer's brain.

  • HDD - This is an acronym for Hard Disk Drive, the traditional spinning drives that store information.

  • LCD - This stands for Liquid Crystal Display, a type of computer screen.

  • MAC - This is an abbreviation for Macintosh, which is a type of personal computer made by the Apple Computer company.

  • OS - This is the Operating System of the computer. It is the main program that runs on a computer and begins automatically when the computer is turned on.

  • PC - This is the abbreviation for personal computer. It originally referred to computers that were IBM compatible.

  • PDF - This represents the Portable Document Format, which displays files in a format that is ready for the web.

  • RAID - A type of storage that can be configured in different ways to provide a redundant copy of files, RAID stands for Redundant Array of Independent Disks.

  • RAM - This stands for Random Access Memory, which is the space inside the computer that can be accessed at one time. If you increase the amount of RAM, then you will increase the computer's speed. This is because more of a particular program is able to be loaded at one time.

  • RDMA - This stands for Remote Direct Memory Access.

  • ROM - This is Read Only Memory, which is the instruction for the computer and cannot be altered.

  • SATA - This stands for Serial Advanced Technology Attachment, a type of hard drive technology.

  • SDS - This stands for Software-Defined Storage, a type of data storage that separates the storage software from the underlying hardware.

  • SSD - This acronym stands for Solid State Drive, a more modern type of hard drive that has no moving parts.

  • VGA - The Video Graphics Array is a system for displaying graphics. It was developed by IBM.
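The byte units above can be checked with a little arithmetic; here is a small Python sketch using binary prefixes (1 KB = 1,024 bytes):

```python
# Convert a raw byte count into KB, MB, and GB (binary prefixes).
def to_units(n_bytes):
    kb = n_bytes / 1024
    mb = kb / 1024
    gb = mb / 1024
    return kb, mb, gb

kb, mb, gb = to_units(5_368_709_120)   # the size of a 5 GB file in bytes
print(gb)   # 5.0
```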

Internet, Networking, and Connectivity

Moving beyond the physical computer right in front of you, acronyms related to networking and getting online are equally numerous and diverse.

  • DNS - This stands for Domain Name System, which translates domain names into the IP addresses used to reach them.

  • FTP - This is a service called File Transfer Protocol, which moves a file between computers using the Internet.

  • HTML - HyperText Markup Language formats information so it can be transported on the Internet.

  • HTTP - Hypertext Transfer Protocol is a set of instructions for the software that controls the movement of files on the Internet.

  • IP - This stands for Internet Protocol which is the set of rules that govern the systems connected to the Internet. IP Address is a digital code specific to each computer that is hooked up to the Internet.

  • ISP - The Internet Service Provider is the company which provides Internet service so you can connect your computer to the Internet.

  • LAN - This stands for Local Area Network, a network that connects computers and devices within a limited area, such as a home or office.

  • PPP - Point-to-Point Protocol is the set of rules that allow your computer to use the Internet protocols using a phone line and modem.

  • SEO - This is an acronym for Search Engine Optimization.

  • URL - This is the Uniform Resource Locator, which is a path to a certain file on the World Wide Web. It is what you may also call the web address.

  • USB - The Universal Serial Bus is used for communications between certain devices. It can connect keyboards, cameras, printers, mice, flash drives, and other devices. Its use has expanded from personal computers to smartphones and video games, and is used as a power cord to connect devices to a wall outlet to charge them.

  • VR - Virtual Reality simulates a three-dimensional scene on the computer and has the capability of interaction. This is widely used in gaming.

  • VRML - Virtual Reality Mark-up Language allows the display of 3D images.

  • WYSIWYG - This initialism stands for What You See Is What You Get. It is pronounced "wizziwig" and basically means that the printer will print what you see on your monitor. It also describes web design programs where what you see in the program is how the website will appear to the end user.
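Several of these pieces fit together in a single web address. As a small illustration, Python's standard urllib module can split a URL into the parts described above (the example URL is made up):

```python
from urllib.parse import urlparse

# Break a URL (web address) into its components.
parts = urlparse("https://www.example.com/docs/index.html?page=2")
print(parts.scheme)   # https            (the protocol, see HTTP above)
print(parts.netloc)   # www.example.com  (the domain name, resolved via DNS)
print(parts.path)     # /docs/index.html (the path to a certain file)
print(parts.query)    # page=2
```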

Common Cloud Computing Abbreviations

Cloud computing has its own set of acronyms and IT abbreviations that can easily confuse people. These are a few of the common ones:

  • BYOC - This stands for Bring Your Own Cloud, often referring to cloud-based file-sharing software.

  • IaaS - This acronym stands for Infrastructure as a Service. It means a service that provides data storage and servers remotely for clients.

  • SaaS - This stands for Software as a Service and refers to on-demand software stored in the cloud.

  • VDI - This abbreviation stands for Virtual Desktop Infrastructure and refers to a virtual desktop that can be accessed by several users.

  • VPN - This stands for Virtual Private Network and is used to represent a secure connection between a network and a user.

Common AI Abbreviations

AI, which stands for Artificial Intelligence, is becoming more common all the time. From your phone to the way you interact with your television, people encounter AI every day. These are some of the common acronyms and abbreviations that go with it:

  • AI - Short for artificial intelligence, AI refers to the development of computer systems to perform tasks like speech recognition and object assessment.

  • ASR - This stands for Automatic Speech Recognition and refers to computers’ ability to understand your speech.

  • DL - This acronym stands for Deep Learning, referring to complicated tasks that require many layers of integration in a machine’s neural network.

  • FKP - This stands for Facial Key Points, the places software looks to recognize faces.

  • ML - ML stands for Machine Learning, the ability of a machine to learn and integrate new information.

Common Email and Chat Abbreviations

Email and web communication require certain abbreviations of their own, including symbols that stand in for facial expressions.

Email Abbreviations

You’ll also see some of these email abbreviations used in texting and online messaging.

  • AFK - Away From Keyboard

  • BC - Blind Copy

  • CIAO - Check It All Out

  • GAL - Get A Life

  • GMTA - Great Minds Think Alike

  • J4F - Just For Fun

  • KISS - Keep it Simple, Stupid

  • LOL - Laughing Out Loud

  • Re: - Regarding

  • TIC - Tongue In Cheek

  • TL;DR - Too Long, Didn't Read

Emoticons

Instead of abbreviating a word or phrase, emoticons attempt to resemble the visual expressions on a human face.

  • :) or :-) - Smiley face

  • :.( - Crying face

  • :-> - Grinning

  • :-| - Indifferent or bored

  • :-( - Sad face

  • ;-) - Winking

  • :-O - Yelling

Friday, April 26, 2024

Introduction


A basic understanding of networking is important for anyone managing a server. Not only is it essential for getting your services online and running smoothly, it also gives you the insight to diagnose problems.

This document will provide a basic overview of some common networking concepts. Here we will discuss basic terminologies.

This guide is operating system agnostic, but should be very helpful when implementing features and services that utilize networking on your server.

Networking Glossary

Before we begin discussing networking with any depth, we must define some common terms that you will see throughout this guide, and in other guides and documentation regarding networking.

These terms will be expanded upon in the appropriate sections that follow:

  • Connection: In networking, a connection refers to pieces of related information that are transferred through a network. This generally implies that a connection is built before the data transfer (by following the procedures laid out in a protocol) and then is deconstructed at the end of the data transfer.

  • Packet: A packet is, generally speaking, the most basic unit that is transferred over a network. When communicating over a network, packets are the envelopes that carry your data (in pieces) from one end point to the other.

Packets have a header portion that contains information about the packet including the source and destination, timestamps, network hops, etc. The main portion of a packet contains the actual data being transferred. It is sometimes called the body or the payload.

  • Network Interface: A network interface can refer to any kind of software interface to networking hardware. For instance, if you have two network cards in your computer, you can control and configure each network interface associated with them individually.

A network interface may be associated with a physical device, or it may be a representation of a virtual interface. The “loopback” device, which is a virtual interface to the local machine, is an example of this.

  • LAN: LAN stands for “local area network”. It refers to a network or a portion of a network that is not publicly accessible to the greater internet. A home or office network is an example of a LAN.

  • WAN: WAN stands for “wide area network”. It means a network that is much more extensive than a LAN. While WAN is the relevant term for large, dispersed networks in general, it is usually taken to mean the internet as a whole.

If an interface is said to be connected to the WAN, it is generally assumed that it is reachable through the internet.

  • Protocol: A protocol is a set of rules and standards that basically define a language that devices can use to communicate. There are a great number of protocols in use extensively in networking, and they are often implemented in different layers.

Some low level protocols are TCP, UDP, IP, and ICMP. Some familiar examples of application layer protocols, built on these lower protocols, are HTTP (for accessing web content), SSH, TLS/SSL, and FTP.

  • Port: A port is an address on a single machine that can be tied to a specific piece of software. It is not a physical interface or location, but it allows your server to be able to communicate using more than one application.

  • Firewall: A firewall is a program that decides whether traffic coming into a server or going out should be allowed. A firewall usually works by creating rules for which type of traffic is acceptable on which ports. Generally, firewalls block ports that are not used by a specific application on a server.

  • NAT: NAT stands for network address translation. It is a way to translate requests that are incoming into a routing server to the relevant devices or servers that it knows about in the LAN. This is usually implemented in physical LANs as a way to route requests through one IP address to the necessary backend servers.

  • VPN: VPN stands for virtual private network. It is a means of connecting separate LANs through the internet, while maintaining privacy. This is used as a means of connecting remote systems as if they were on a local network, often for security reasons.
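Several of these terms — connections, ports, packets, and the loopback interface — can be seen in action with a few lines of Python's standard socket module. This is a minimal sketch over the local machine, not production code:

```python
import socket
import threading

# A tiny client/server exchange over the loopback interface (127.0.0.1).
def run_server(server_sock):
    conn, addr = server_sock.accept()   # a connection is built
    data = conn.recv(1024)              # receive the packet payload
    conn.sendall(b"got: " + data)
    conn.close()                        # the connection is deconstructed

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))           # port 0: let the OS pick a free port
server.listen(1)
port = server.getsockname()[1]

t = threading.Thread(target=run_server, args=(server,))
t.start()

client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(("127.0.0.1", port))     # the TCP handshake builds the connection
client.sendall(b"hello")
reply = client.recv(1024)
client.close()
t.join()
server.close()
print(reply)   # b'got: hello'
```

The port number ties the traffic to this one program, which is what lets a server run many networked applications side by side.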

Wednesday, April 24, 2024

Electronic Commerce


E-Commerce, or Electronic Commerce, means the buying and selling of goods, products, or services over the internet. E-commerce is also known as internet commerce. These services are provided online over the internet. Transactions of money, funds, and data are also considered e-commerce. These business transactions can be done in four ways: Business to Business (B2B), Business to Consumer (B2C), Consumer to Consumer (C2C), and Consumer to Business (C2B). The standard definition of e-commerce is a commercial transaction that happens over the internet. Online stores like Amazon, Flipkart, Shopify, Myntra, Ebay, Quikr, and Olx are examples of e-commerce websites. Global retail e-commerce was projected to reach up to $27 trillion by 2020. Let us learn in detail about the advantages and disadvantages of e-commerce and its types.

E-Commerce or Electronic Commerce

E-commerce is a popular term for electronic commerce or even internet commerce. The name is self-explanatory, it is the meeting of buyers and sellers on the internet. This involves the transaction of goods and services, the transfer of funds and the exchange of data.

Types of E-Commerce Models

Electronic commerce can be classified into four main categories. The basis for this simple classification is the parties that are involved in the transactions. So the four basic electronic commerce models are as follows:

1. Business to Business

This is Business to Business transactions. Here the companies are doing business with each other. The final consumer is not involved. So the online transactions only involve the manufacturers, wholesalers, retailers, etc.

2. Business to Consumer

Business to Consumer. Here the company sells its goods and/or services directly to the consumer. The consumer can browse the company's website, look at products and pictures, and read reviews. Then they place their order and the company ships the goods directly to them. Popular examples are Amazon, Flipkart, Jabong, etc.

3. Consumer to Consumer

Consumer to consumer, where the consumers are in direct contact with each other. No company is involved. It helps people sell their personal goods and assets directly to an interested party. Usually, goods traded are cars, bikes, electronics etc. OLX, Quikr etc follow this model.

4. Consumer to Business

This is the reverse of B2C; it is consumer to business. So the consumer provides a good or some service to the company. Say, for example, an IT freelancer who demos and sells his software to a company. This would be a C2B transaction.

Advantages of E-Commerce

  • E-commerce provides the sellers with a global reach. They remove the barrier of place (geography). Now sellers and buyers can meet in the virtual world, without the hindrance of location.

  • Electronic commerce will substantially lower the transaction cost. It eliminates many fixed costs of maintaining brick and mortar shops. This allows the companies to enjoy a much higher margin of profit.

  • It provides quick delivery of goods with very little effort on part of the customer. Customer complaints are also addressed quickly. It also saves time, energy and effort for both the consumers and the company.

  • One other great advantage is the convenience it offers. A customer can shop 24×7. The website is functional at all times, it does not have working hours like a shop.

  • Electronic commerce also allows the customer and the business to be in touch directly, without any intermediaries. This allows for quick communication and transactions. It also gives a valuable personal touch.

Disadvantages of E-Commerce

  • The start-up costs of the e-commerce portal are very high. The setup of the hardware and the software, the training cost of employees, the constant maintenance and upkeep are all quite expensive.

  • Although it may seem like a sure thing, the e-commerce industry has a high risk of failure. Many companies riding the dot-com wave of the 2000s have failed miserably. The high risk of failure remains even today.

  • At times, e-commerce can feel impersonal. So it lacks the warmth of an interpersonal relationship which is important for many brands and products. This lack of a personal touch can be a disadvantage for many types of services and products like interior designing or the jewelry business.

  • Security is another area of concern. In recent years we have witnessed many security breaches in which customer information was stolen. Credit card theft, identity theft, etc. remain big concerns for customers.

  • Then there are also fulfillment problems. Even after the order is placed there can be problems with shipping, delivery, mix-ups etc. This leaves the customers unhappy and dissatisfied.

Tuesday, April 23, 2024

Programming language


Computer programming languages allow us to give instructions to a computer in a language the computer understands. Just as many human languages exist, there is an array of computer programming languages that programmers can use to communicate with a computer. The form a computer can execute directly is called “binary,” and translating a programming language into binary is known as “compiling.” Each language, from C to Python, has its own distinct features, though there are often commonalities between programming languages.

These languages allow computers to quickly and efficiently process large and complex swaths of information. For example, if a person is given a list of randomized numbers ranging from one to ten thousand and asked to place them in ascending order, it will likely take a sizable amount of time and include some errors; a computer can perform the same task in a fraction of a second.
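To illustrate, here is a short Python sketch of that same sorting task:

```python
import random

# Ten thousand randomized numbers from one to ten thousand,
# sorted into ascending order almost instantly.
numbers = random.sample(range(1, 10_001), k=10_000)
ascending = sorted(numbers)

print(ascending[:5])   # the five smallest numbers: [1, 2, 3, 4, 5]
print(ascending[-1])   # the largest number: 10000
```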

Language Types

 

Machine and assembly languages

A machine language consists of the numeric codes for the operations that a particular computer can execute directly. The codes are strings of 0s and 1s, or binary digits (“bits”), which are frequently converted both from and to hexadecimal (base 16) for human viewing and modification. Machine language instructions typically use some bits to represent operations, such as addition, and some to represent operands, or perhaps the location of the next instruction. Machine language is difficult to read and write, since it does not resemble conventional mathematical notation or human language, and its codes vary from computer to computer.

Assembly language is designed to be easily translated into machine language. Although blocks of data may be referred to by name instead of by their machine addresses, assembly language does not provide more sophisticated means of organizing complex information. Like machine language, assembly language requires detailed knowledge of internal computer architecture. It is useful when such details are important, as in programming a computer to interact with peripheral devices (printers, scanners, storage devices, and so forth).

Algorithmic languages

Algorithmic languages are designed to express mathematical or symbolic computations. They can express algebraic operations in notation similar to mathematics and allow the use of subprograms that package commonly used operations for reuse. They were the first high-level languages.

FORTRAN

The first important algorithmic language was FORTRAN (formula translation), designed in 1957 by an IBM team led by John Backus. It was intended for scientific computations with real numbers and collections of them organized as one- or multidimensional arrays. Its control structures included conditional IF statements, repetitive loops (so-called DO loops), and a GOTO statement that allowed nonsequential execution of program code. FORTRAN made it convenient to have subprograms for common mathematical operations, and built libraries of them.

FORTRAN was also designed to translate into efficient machine language. It was immediately successful and continues to evolve.

 ALGOL

ALGOL (algorithmic language) was designed by a committee of American and European computer scientists during 1958–60 for publishing algorithms, as well as for doing computations. Like LISP (described in the next section), ALGOL had recursive subprograms—procedures that could invoke themselves to solve a problem by reducing it to a smaller problem of the same kind. ALGOL introduced block structure, in which a program is composed of blocks that might contain both data and instructions and have the same structure as an entire program. Block structure became a powerful tool for building large programs out of small components.

ALGOL contributed a notation for describing the structure of a programming language, Backus–Naur Form, which in some variation became the standard tool for stating the syntax (grammar) of programming languages. ALGOL was widely used in Europe, and for many years it remained the language in which computer algorithms were published. Many important languages, such as Pascal and Ada (both described later), are its descendants.

LISP

LISP (list processing) was developed about 1960 by John McCarthy at the Massachusetts Institute of Technology (MIT) and was founded on the mathematical theory of recursive functions (in which a function appears in its own definition). A LISP program is a function applied to data, rather than being a sequence of procedural steps as in FORTRAN and ALGOL. LISP uses a very simple notation in which operations and their operands are given in a parenthesized list. For example, (+ a (* b c)) stands for a + b*c. Although this appears awkward, the notation works well for computers. LISP also uses the list structure to represent data, and, because programs and data use the same structure, it is easy for a LISP program to operate on other programs as data.
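The parenthesized prefix notation can be illustrated with a small sketch in Python, using nested Python lists to stand in for LISP lists (an analogy for illustration, not a real LISP interpreter):

```python
# A minimal sketch of LISP-style prefix evaluation. Nested Python
# lists play the role of LISP's parenthesized lists.
def evaluate(expr):
    if not isinstance(expr, list):        # a number evaluates to itself
        return expr
    op, *args = expr
    values = [evaluate(a) for a in args]  # evaluate operands recursively
    if op == "+":
        return sum(values)
    if op == "*":
        result = 1
        for v in values:
            result *= v
        return result
    raise ValueError(f"unknown operator: {op}")

# (+ a (* b c)) with a=1, b=2, c=3  ->  1 + 2*3 = 7
print(evaluate(["+", 1, ["*", 2, 3]]))  # 7
```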

LISP became a common language for artificial intelligence (AI) programming, partly owing to the confluence of LISP and AI work at MIT and partly because AI programs capable of “learning” could be written in LISP as self-modifying programs. LISP has evolved through numerous dialects, such as Scheme and Common LISP.

C

The C programming language was developed in 1972 by Dennis Ritchie and Brian Kernighan at the AT&T Corporation for programming computer operating systems. Its capacity to structure data and programs through the composition of smaller units is comparable to that of ALGOL. It uses a compact notation and provides the programmer with the ability to operate with the addresses of data as well as with their values. This ability is important in systems programming, and C shares with assembly language the power to exploit all the features of a computer’s internal architecture. C, along with its descendant C++, remains one of the most common languages.

 

Business-oriented languages

 

COBOL

COBOL (common business oriented language) has been heavily used by businesses since its inception in 1959. A committee of computer manufacturers and users and U.S. government organizations established CODASYL (Committee on Data Systems and Languages) to develop and oversee the language standard in order to ensure its portability across diverse systems.

COBOL uses an English-like notation—novel when introduced. Business computations organize and manipulate large quantities of data, and COBOL introduced the record data structure for such tasks. A record clusters heterogeneous data—such as a name, an ID number, an age, and an address—into a single unit. This contrasts with scientific languages, in which homogeneous arrays of numbers are common. Records are an important example of “chunking” data into a single object, and they appear in nearly all modern languages.
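The record idea can be sketched in a modern language; here is a minimal illustration using a Python dataclass (the field names are assumptions for the example, not COBOL syntax):

```python
from dataclasses import dataclass

# A record clusters heterogeneous data into a single unit.
@dataclass
class EmployeeRecord:
    name: str
    id_number: int
    age: int
    address: str

record = EmployeeRecord("Ada Lovelace", 1815, 36, "London")
print(record.name)  # fields are accessed by name, not by position
```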


SQL

SQL (structured query language) is a language for specifying the organization of databases (collections of records). Databases organized with SQL are called relational, because SQL provides the ability to query a database for information that falls in a given relation. For example, a query might be “find all records with both last name Smith and city New York.” Commercial database programs commonly use an SQL-like language for their queries.
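The example query above can be sketched with Python's built-in sqlite3 module; the table name and column names here are assumptions for illustration:

```python
import sqlite3

# Build a tiny in-memory relational database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (last_name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?)",
    [("Smith", "New York"), ("Smith", "Boston"), ("Jones", "New York")],
)

# "Find all records with both last name Smith and city New York."
rows = conn.execute(
    "SELECT * FROM people WHERE last_name = ? AND city = ?",
    ("Smith", "New York"),
).fetchall()
print(rows)  # [('Smith', 'New York')]
conn.close()
```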

Education-oriented languages

BASIC

BASIC (beginner’s all-purpose symbolic instruction code) was designed at Dartmouth College in the mid-1960s by John Kemeny and Thomas Kurtz. It was intended to be easy to learn by novices, particularly non-computer science majors, and to run well on a time-sharing computer with many users. It had simple data structures and notation and it was interpreted: a BASIC program was translated line-by-line and executed as it was translated, which made it easy to locate programming errors.

Its small size and simplicity also made BASIC a popular language for early personal computers. Its recent forms have adopted many of the data and control structures of other contemporary languages, which makes it more powerful but less convenient for beginners.

PASCAL

About 1970 Niklaus Wirth of Switzerland designed Pascal to teach structured programming, which emphasized the orderly use of conditional and loop control structures without GOTO statements. Although Pascal resembled ALGOL in notation, it provided the ability to define data types with which to organize complex information, a feature beyond the capabilities of ALGOL as well as FORTRAN and COBOL. User-defined data types allowed the programmer to introduce names for complex data, which the language translator could then check for correct usage before running a program.

During the late 1970s and ’80s, Pascal was one of the most widely used languages for programming instruction. It was available on nearly all computers, and, because of its familiarity, clarity, and security, it was used for production software as well as for education. 

HyperTalk

HyperTalk was designed as “programming for the rest of us” by Bill Atkinson for Apple’s Macintosh. Using a simple English-like syntax, HyperTalk enabled anyone to combine text, graphics, and audio quickly into “linked stacks” that could be navigated by clicking with a mouse on standard buttons supplied by the program. HyperTalk was particularly popular among educators in the 1980s and early ’90s for classroom multimedia presentations. Although HyperTalk had many features of object-oriented languages (described in the next section), Apple did not develop it for other computer platforms and let it languish; as Apple’s market share declined in the 1990s, a new cross-platform way of displaying multimedia left HyperTalk all but obsolete (see the section World Wide Web Display Languages).

Object-oriented languages

Object-oriented languages help to manage complexity in large programs. Objects package data and the operations on them so that only the operations are publicly accessible and internal details of the data structures are hidden. This information hiding made large-scale programming easier by allowing a programmer to think about each part of the program in isolation. In addition, objects may be derived from more general ones, “inheriting” their capabilities. Such an object hierarchy made it possible to define specialized objects without repeating all that is in the more general ones.
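Information hiding and inheritance can be sketched briefly in Python (a minimal illustration, not tied to any particular language described below):

```python
# Objects package data with their operations; internal details stay hidden.
class BankAccount:
    def __init__(self, balance=0):
        self._balance = balance          # internal detail, hidden by convention

    def deposit(self, amount):           # only the operations are public
        self._balance += amount

    def balance(self):
        return self._balance

class SavingsAccount(BankAccount):       # "inherits" deposit() and balance()
    def add_interest(self, rate):
        self._balance += self._balance * rate

account = SavingsAccount(100)
account.deposit(50)
account.add_interest(0.10)
print(account.balance())  # 165.0
```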

Object-oriented programming began with the Simula language (1967), which added information hiding to ALGOL. Another influential object-oriented language was Smalltalk (1980), in which a program was a set of objects that interacted by sending messages to one another. 

C++

The C++ language, developed by Bjarne Stroustrup at AT&T in the mid-1980s, extended C by adding objects to it while preserving the efficiency of C programs. It has been one of the most important languages for both education and industrial programming. Large parts of many operating systems were written in C++. C++, along with Java, has become popular for developing commercial software packages that incorporate multiple interrelated applications. C++ is considered one of the fastest languages and is very close to low-level languages, thus allowing complete control over memory allocation and management. This very feature and its many other capabilities also make it one of the most difficult languages to learn and handle on a large scale.

C#

C# (pronounced C sharp like the musical note) was developed by Anders Hejlsberg at Microsoft in 2000. C# has a syntax similar to that of C and C++ and is often used for developing games and applications for the Microsoft Windows Operating System.

Java

In the early 1990s, Java was designed by Sun Microsystems, Inc., as a programming language for the World Wide Web (WWW). Although it resembled C++ in appearance, it was fully object-oriented. In particular, Java dispensed with lower-level features, including the ability to manipulate data addresses, a capability that is neither desirable nor useful in programs for distributed systems. In order to be portable, Java programs are translated by a Java Virtual Machine specific to each computer platform, which then executes the Java program. In addition to adding interactive capabilities to the internet through Web “applets,” Java has been widely used for programming small and portable devices, such as mobile telephones.

Visual Basic

Visual Basic was developed by Microsoft to extend the capabilities of BASIC by adding objects and “event-driven” programming: buttons, menus, and other elements of graphical user interfaces (GUIs). Visual Basic can also be used within other Microsoft software to program small routines. Visual Basic was succeeded in 2002 by Visual Basic .NET, a vastly different language built on the .NET Framework and sharing its underpinnings with C#, a language with similarities to C++.

Python

The open-source language Python was developed by Dutch programmer Guido van Rossum in 1991. It was designed as an easy-to-use language, with features such as using indentation instead of brackets to group statements. Python is also a very compact language, designed so that complex jobs can be executed with only a few statements. In the 2010s, Python became one of the most popular programming languages, along with Java and JavaScript.
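A small sketch of these features, showing indentation-based grouping and a compact list comprehension:

```python
# Python groups statements by indentation and favors compact expressions.
def classify(numbers):
    evens = [n for n in numbers if n % 2 == 0]   # list comprehension
    odds = [n for n in numbers if n % 2 != 0]
    return evens, odds

evens, odds = classify(range(1, 8))
print(evens)  # [2, 4, 6]
print(odds)   # [1, 3, 5, 7]
```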

 

Thursday, April 18, 2024

Data processing


Data processing occurs when data is collected and translated into usable information. Usually performed by a data scientist or a team of data scientists, data processing must be done correctly so as not to negatively affect the end product, or data output.

Data processing starts with data in its raw form and converts it into a more readable format (graphs, documents, etc.), giving it the form and context necessary to be interpreted by computers and utilized by employees throughout an organization.

Six stages of data processing

1. Data collection

Collecting data is the first step in data processing. Data is pulled from available sources, including data lakes and data warehouses. It is important that the available data sources are trustworthy and well-built so that the data collected (and later used as information) is of the highest possible quality.

2. Data preparation

Once the data is collected, it then enters the data preparation stage. Data preparation, often referred to as “pre-processing,” is the stage at which raw data is cleaned up and organized for the following stage of data processing. During preparation, raw data is diligently checked for any errors. The purpose of this step is to eliminate bad data (redundant, incomplete, or incorrect data) and begin to create high-quality data for the best business intelligence.
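The preparation step can be sketched as follows; the records and the validity rules are illustrative assumptions:

```python
# Removing redundant, incomplete, and incorrect records from raw data.
raw = [
    {"name": "Alice", "age": 34},
    {"name": "Alice", "age": 34},    # redundant duplicate
    {"name": "Bob", "age": None},    # incomplete
    {"name": "Carol", "age": -5},    # incorrect (impossible age)
    {"name": "Dan", "age": 41},
]

seen = set()
clean = []
for record in raw:
    key = (record["name"], record["age"])
    complete = record["age"] is not None
    plausible = complete and 0 <= record["age"] <= 120
    if plausible and key not in seen:
        seen.add(key)
        clean.append(record)

print(clean)  # [{'name': 'Alice', 'age': 34}, {'name': 'Dan', 'age': 41}]
```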

3. Data input

The clean data is then entered into its destination (perhaps a CRM like Salesforce or a data warehouse like Amazon Redshift) and translated into a form the destination system can understand. Data input is the first stage in which raw data begins to take the form of usable information.

4. Processing

During this stage, the data input to the computer in the previous stage is actually processed for interpretation. Processing is often done using machine learning algorithms, though the process itself may vary slightly depending on the source of the data being processed (data lakes, social networks, connected devices, etc.) and its intended use (examining advertising patterns, medical diagnosis from connected devices, determining customer needs, etc.).

5. Data output/interpretation

The output/interpretation stage is the stage at which data is finally usable to non-data scientists. It is translated, readable, and often in the form of graphs, videos, images, or plain text. Members of the company or institution can now begin to self-serve the data for their own data analytics projects.

6. Data storage

The final stage of data processing is storage. After all of the data is processed, it is then stored for future use. While some information may be put to use immediately, much of it will serve a purpose later on. Plus, properly stored data is a necessity for compliance with data protection legislation like GDPR. When data is properly stored, it can be quickly and easily accessed by members of the organization when needed.

Data processing can be defined by the following steps

  • Data capture, or data collection,
  • Data storage,
  • Data conversion (changing to a usable or uniform format),
  • Data cleaning and error removal,
  • Data validation (checking the conversion and cleaning),
  • Data separation and sorting (drawing patterns, relationships, and creating subsets),
  • Data summarization and aggregation (combining subsets in different groupings for more information),
  • Data presentation and reporting.
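The steps above can be sketched as a minimal pipeline in Python (the data and helper functions are illustrative assumptions):

```python
# A toy pipeline: capture -> convert -> clean -> validate -> summarize.
def capture():
    return ["  42 ", "17", "oops", "17", "250"]

def convert(raw):                       # change to a uniform format
    return [s.strip() for s in raw]

def clean(values):                      # remove values that are not numbers
    return [v for v in values if v.isdigit()]

def validate(values):                   # check the conversion and cleaning
    assert all(v.isdigit() for v in values)
    return [int(v) for v in values]

def summarize(numbers):                 # aggregate for reporting
    return {"count": len(numbers), "total": sum(numbers)}

report = summarize(validate(clean(convert(capture()))))
print(report)  # {'count': 4, 'total': 326}
```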

There are different types of data processing techniques, depending on what the data is needed for. Types of data processing at a bench level may include:

  • Statistical,
  • Algebraic,
  • Mapping and plotting,
  • Forest and tree method,
  • Machine learning,
  • Linear models,
  • Non-linear models,
  • Relational processing, and
  • Non-relational processing.

These are methodologies and techniques that can be applied within the key types of data processing.

What we’re going to discuss in this article is the five main hierarchical types of data processing. Or, in other words, the overarching types of systems in data analytics.

Data Processing by Application Type

The first two key types of data processing I’m going to talk about are scientific data processing and commercial data processing.

1. Scientific Data Processing

When used in scientific study or research and development work, data sets can require quite different methods than commercial data processing.

Scientific data processing is a special type of data processing used in academic and research fields.

It’s vitally important for scientific data that there are no significant errors that contribute to wrongful conclusions. Because of this, the cleaning and validating steps can take a considerably larger amount of time than for commercial data processing.

Scientific data processing needs to draw conclusions, so the steps of sorting and summarization often need to be performed very carefully, using a wide variety of processing tools to ensure no selection biases or wrong relationships are produced.

Scientific data processing often needs a subject-matter expert in addition to a data expert to work with the quantities involved.

2. Commercial Data Processing

Commercial data processing has multiple uses, and may not necessarily require complex sorting. It was first used widely in the field of marketing, for customer relationship management applications, and in banking, billing, and payroll functions.

Most of the data captured in these applications is standardized and somewhat error-proofed. That is, capture fields are designed to eliminate errors, so in some cases raw data can be processed directly, or with minimal, largely automated error checking.

Commercial data processing usually applies standard relational databases and uses batch processing. However, some applications, particularly in technology, may use non-relational databases.

There are still many applications within commercial data processing that lean towards a scientific approach, such as predictive market research. These may be considered a hybrid of the two methods.

Data Processing Types by Processing Method

Within the main areas of scientific and commercial processing, different methods are used for applying the processing steps to data. The three main types of data processing we’re going to discuss are automatic/manual, batch, and real-time data processing.

3. Automatic versus Manual Data Processing

It may not seem possible, but even today people still use manual data processing. Bookkeeping data processing functions can be performed from a ledger, customer surveys may be manually collected and processed, and even spreadsheet-based data processing is now considered somewhat manual. In some of the more difficult parts of data processing, a manual component may be needed for intuitive reasoning.

The first technology that led to the development of automated systems in data processing was punch cards used in census counting. Punch cards were also used in the early days of payroll data processing.

The Rise of Computers for Data Processing

Computers started being used by corporations in the 1970s when electronic data processing began to develop. Some of the first applications for automated data processing in the way of specialized databases were developed for customer relationship management (CRM) to drive better sales.

Electronic data management became widespread with the introduction of the personal computer in the 1980s. Spreadsheets provided simple electronic assistance for even everyday data management functions such as personal budgeting and expense allocations.

Database Management

Database management provided more automation of data processing functions, which is why I refer to spreadsheets as a now rather manual tool in data management. The user is required to manipulate all the data in a spreadsheet, almost like a manual system, only the calculations are automated. Whereas in a database, users can extract data relationships and reports relatively easily, providing the setup and entries are correctly managed.

Autonomous databases now look to be a data processing method of the future, especially in commercial data processing. Oracle and Peloton are poised to offer users more automation with what is termed a “self-driving” database.

This development in the field of automatic data processing, combined with machine learning tools for optimizing and improving service, aims to make accessing and managing data easier for end-users, without the need for highly specialized data professionals in-house.

4. Batch Processing

Before the widespread use of distributed systems architecture, and in some cases even after it, stand-alone computer systems applied batch processing techniques to save computational time. This is particularly useful in financial applications or where data requires additional layers of security, such as medical records.

Batch processing completes a range of data processes as a batch, using single commands to apply actions to multiple data sets.

This is a little like the comparison of a computer spreadsheet to a calculator in some ways. A calculation can be applied with one function, that is one step, to a whole column or series of columns, giving multiple results from one action. The same concept is achieved in batch processing for data. A series of actions or results can be achieved by applying a function to a whole series of data. In this way, computer processing time is far less.
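The spreadsheet analogy can be sketched in Python: one function, applied in a single step to a whole series of values (the tax calculation is an illustrative assumption):

```python
# Batch processing: one function applied to a whole series of data,
# like a spreadsheet formula filled down an entire column.
def apply_batch(func, column):
    return [func(value) for value in column]

prices = [100.0, 250.0, 75.0]
with_tax = apply_batch(lambda p: round(p * 1.08, 2), prices)
print(with_tax)  # [108.0, 270.0, 81.0]
```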

Batch processing can complete a queue of tasks without human intervention, and data systems may assign priorities to certain functions or set times at which batch processing can run.

Banks typically use this process to execute transactions after the close of business, where computers are no longer involved in data capture and can be dedicated to processing functions.

5. Real-Time Data Processing

For commercial uses, many large data processing applications require real-time processing; that is, they need to get results from data exactly as it happens. One application of this that most of us can identify with is tracking stock market and currency trends. The data needs to be updated immediately, since investors buy in real time and prices update by the minute. Data on airline schedules and ticketing, and GPS tracking applications in transport services, have similar needs for real-time updates.

Stream Processing

The most common technology used in real-time processing is stream processing. The data analytics are drawn directly from the stream, that is, at the source. Because conclusions are drawn from the data without first uploading and transforming it, the process is much quicker.
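Stream processing can be sketched with a Python generator, which draws a result from each item as it arrives rather than after the full data set is loaded (the price feed is an illustrative stand-in):

```python
# A sketch of stream processing: analytics computed item by item,
# directly from the stream, without loading the whole data set first.
def price_stream():
    for price in [101.2, 101.5, 100.9, 102.3]:   # stand-in for a live feed
        yield price

def running_max(stream):
    current = float("-inf")
    for price in stream:
        current = max(current, price)
        yield current                             # one result per item

print(list(running_max(price_stream())))  # [101.2, 101.5, 101.5, 102.3]
```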

Data Virtualization

Data virtualization techniques are another important development in real-time data processing. The data remains in its source form, and only the information needed is pulled for processing. The beauty of data virtualization is that where transformation is not necessary, it is not done, so the error margin is reduced.

Data virtualization and stream processing mean that data analytics can be drawn in real-time much quicker, benefiting many technical and financial applications, reducing processing times and errors.

Beyond these popular data processing techniques, there are three more processing techniques, described below.

6. Online Processing

This data processing technique is derived from automatic data processing and is also known as immediate or random-access processing. Under this technique, transactions are processed by the system at the moment they occur, with data sets prepared continuously. This processing method emphasizes fast input of transactions and connects directly with the databases.

7. Multi Processing

This is the most commonly used data processing technique, employed all over the globe wherever computer-based setups for data capture and processing exist.

As the name suggests, multiprocessing is not bound to a single CPU but uses a collection of several CPUs. Because several processing devices work in parallel, overall efficiency is much higher.

The tasks are broken into pieces and then sent to the multiple processors for processing. Results are obtained in less time, and throughput is increased. An additional benefit is that every processing unit is independent, so the failure of one does not affect the working of the other processing units.
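This division of work can be sketched with Python's standard multiprocessing module (the task itself, squaring numbers, is an illustrative assumption):

```python
import os
from multiprocessing import Pool

# Work is split across several worker processes, each running
# independently on its own CPU core.
def square(n):
    return n * n

if __name__ == "__main__":
    with Pool(processes=os.cpu_count()) as pool:
        results = pool.map(square, range(1, 6))  # work split across workers
    print(results)  # [1, 4, 9, 16, 25]
```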

8. Time Sharing

This kind of data processing is entirely based on time: one processing unit is shared by several users. Each user is allocated a time slot during which to work on the same CPU/processing unit.

Processor time is divided into segments that are allocated to individual users, so their time slots do not collide, making it a multi-access system. This processing technique is also widely used, particularly in startups.

Quick Tips to Analyze Best Processing Techniques

  1. Understanding your requirements is essential before choosing the best processing technique for your project.

  2. Filter your data in a much more precise manner so you can apply processing techniques.

  3. Recheck your filtered data to ensure it still represents the original requirement and that no important fields are missing.

  4. Think about the OUTPUT which you would like to have so you can follow your idea.

  5. Now that you have the filtered data and the output you wish to have, choose the best and most reliable processing technique.

  6. Once you choose your technique as per your requirement, it will be easy to follow up for the end result.

  7. Check the chosen technique as it runs, so that loopholes are caught and mistakes avoided.

  8. Always apply ETL functions to recheck your datasets.

  9. Don’t forget to apply a timeline to your requirement; without a specific timeline, effort is easily wasted.

  10. Test your OUTPUT again with the initial requirement for better delivery.
