Channel: Machine Learning

[D] Simple Questions Thread


Please post your questions here instead of creating a new thread. Encourage others who create new posts for questions to post here instead!

Thread will stay alive until next one so keep posting after the date in the title.

Thanks to everyone for answering questions in the previous thread!

submitted by /u/AutoModerator
[link] [comments]

[R] AIOS: LLM Agent Operating System


Paper: https://arxiv.org/abs/2403.16971

Github: https://github.com/agiresearch/AIOS

Abstract: The integration and deployment of large language model (LLM)-based intelligent agents have been fraught with challenges that compromise their efficiency and efficacy. Among these issues are sub-optimal scheduling and resource allocation of agent requests over the LLM, the difficulties in maintaining context during interactions between agent and LLM, and the complexities inherent in integrating heterogeneous agents with different capabilities and specializations. The rapid increase of agent quantity and complexity further exacerbates these issues, often leading to bottlenecks and sub-optimal utilization of resources. Inspired by these challenges, this paper presents AIOS, an LLM agent operating system, which embeds large language model into operating systems (OS) as the brain of the OS, enabling an operating system "with soul" -- an important step towards AGI. Specifically, AIOS is designed to optimize resource allocation, facilitate context switch across agents, enable concurrent execution of agents, provide tool service for agents, and maintain access control for agents. We present the architecture of such an operating system, outline the core challenges it aims to resolve, and provide the basic design and implementation of the AIOS. Our experiments on concurrent execution of multiple agents demonstrate the reliability and efficiency of our AIOS modules. Through this, we aim to not only improve the performance and efficiency of LLM agents but also to pioneer for better development and deployment of the AIOS ecosystem in the future.

An overview of the AIOS architecture.

submitted by /u/TouchLive4686
[link] [comments]

[R] Zero Mean Leaky ReLu


Hi,

At the risk of groans of "not another ReLu activation function variant", I thought I'd share a simple trick to make the (Leaky)ReLu better behaved, in particular to address criticism about the (Leaky)ReLu not being zero-centred.

The simple trick is to offset the (Leaky)ReLu unit by the expectation of the output under a zero-mean normally distributed input:

Zero Mean Leaky ReLu:

y(x) = max(x, a*x) - k

k = ((1 - a)*s)/sqrt(2*pi)

y'(x) = a for x < 0 (equivalently y < -k), and 1 otherwise

where s is the standard deviation of the zero-mean, normally distributed input (see the note on choosing s below).

The resulting activation function is still cheap to compute. It also seems to make the vanilla ReLu (a=0) better behaved.

The standard deviation should be chosen based on what you expect it to be given your weight initialisation scheme. If in doubt, s=1 is a good start.
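A minimal PyTorch sketch of the activation as defined above (the module name, the parameter defaults, and the use of F.leaky_relu are illustrative choices, not the author's implementation):

```python
import math

import torch
import torch.nn as nn
import torch.nn.functional as F


class ZeroMeanLeakyReLU(nn.Module):
    """Leaky ReLU shifted by k = (1 - a) * s / sqrt(2 * pi) so the output has
    zero mean under a zero-mean Gaussian input with standard deviation s."""

    def __init__(self, a: float = 0.1, s: float = 1.0):
        super().__init__()
        self.a = a
        self.k = (1.0 - a) * s / math.sqrt(2.0 * math.pi)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # F.leaky_relu(x, a) equals max(x, a*x) for 0 <= a < 1; subtracting k centres it.
        return F.leaky_relu(x, negative_slope=self.a) - self.k


if __name__ == "__main__":
    # Quick sanity check: the output mean should be close to zero for N(0, 1) inputs.
    act = ZeroMeanLeakyReLU(a=0.1, s=1.0)
    x = torch.randn(1_000_000)
    print(act(x).mean())  # expected to be close to 0
```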

I'm currently working on a paper on sparse optimisation, and this small offset improved the margin by which my model beat current state-of-the-art. However, since it's not actually part of the core innovation, I thought I'd share!

Mark

https://preview.redd.it/ksasmdsuooqc1.png?width=258&format=png&auto=webp&s=7113f32a906304563ed99be0c23c525cbde4be6f

Example graph for a=1/10, s=1

https://preview.redd.it/2y10rttv8pqc1.png?width=653&format=png&auto=webp&s=64cdaeb0dca6efca5b97a71a59ad28a88160e316

submitted by /u/1nyouendo
[link] [comments]

PyTorch Dataloader Optimizations [D]


What are some optimizations one could use for the DataLoader in PyTorch? The data type could be anything, but I primarily work with images and text. I know you can define your own, but does anyone have any clever tricks to share? Thank you in advance!
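A hedged starting point covering the common knobs (the FakeData dataset and the specific values below are placeholders to keep the snippet self-contained; tune them to your hardware):

```python
import torch
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

# Placeholder dataset; swap in your own.
transform = transforms.Compose(
    [transforms.Resize(256), transforms.CenterCrop(224), transforms.ToTensor()]
)
train_set = datasets.FakeData(size=10_000, image_size=(3, 256, 256), transform=transform)

loader = DataLoader(
    train_set,
    batch_size=128,
    shuffle=True,
    num_workers=8,            # parallel decoding/augmentation; tune to your CPU cores
    pin_memory=True,          # faster host-to-GPU copies via pinned memory
    persistent_workers=True,  # keep workers alive between epochs (avoids re-fork cost)
    prefetch_factor=4,        # batches prefetched per worker
    drop_last=True,
)

device = "cuda" if torch.cuda.is_available() else "cpu"
for images, labels in loader:
    # non_blocking=True overlaps the copy with compute when pin_memory is set
    images = images.to(device, non_blocking=True)
    labels = labels.to(device, non_blocking=True)
    break  # just demonstrating the transfer pattern
```

Beyond these flags, the next biggest wins usually come from moving heavy augmentation to the GPU or pre-decoding data into a faster on-disk format.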

submitted by /u/MuscleML
[link] [comments]

[D] Is Synthetic Data a Reliable Option for Training Machine Learning Models?


"The most obvious advantage of synthetic data is that it contains no personally identifiable information (PII). Consequently, it doesn’t pose the same cybersecurity risks as conventional data science projects. However, the big question for machine learning is whether this information is reliable enough to produce functioning ML models."

An informative blog post on using synthetic data in machine learning; source: https://opendatascience.com/is-synthetic-data-a-reliable-option-for-training-machine-learning-models/

submitted by /u/Data_Nerd1979
[link] [comments]

[D] Seeking Advice


I'm currently pursuing my undergraduate degree in robotics engineering and have been immersing myself in concepts related to machine learning, deep learning, and computer vision, both modern and traditional. With strong programming skills and a habit of regularly reading research papers, I'm eager to understand the job landscape in my field and pursue a Phd. Are there ample opportunities available? What can I expect in terms of salaries and future prospects? Additionally, I'm curious about the comparative job market between natural language processing (NLP) and computer vision. Given my background and interests, what areas or skills should I focus on learning to enhance my career prospects? Thanks in advance for your time and advice.

submitted by /u/MD24IB
[link] [comments]

[D] Dataloading from external disk


Hey there,

I am training a deep learning model using a 400 GB dataset stored on an external SSD, and I noticed that training is very slow. Any tricks to make data loading faster?

PS : I have to use the external disk
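Beyond the usual num_workers/pin_memory tuning, one trick that can help with a slow external drive is caching each file on the faster internal disk the first time it is read, assuming some internal space is free. A rough sketch (the cache directory, file layout, and class name are assumptions):

```python
import shutil
from pathlib import Path

from PIL import Image
from torch.utils.data import Dataset


class CachedImageDataset(Dataset):
    """Reads images from a slow external drive, but copies each file to a local
    cache directory on first access so later epochs hit the fast internal disk."""

    def __init__(self, source_dir: str, cache_dir: str = "/tmp/dataset_cache", transform=None):
        self.source_paths = sorted(Path(source_dir).glob("**/*.jpg"))
        self.cache_dir = Path(cache_dir)
        self.cache_dir.mkdir(parents=True, exist_ok=True)
        self.transform = transform

    def __len__(self):
        return len(self.source_paths)

    def __getitem__(self, idx):
        src = self.source_paths[idx]
        cached = self.cache_dir / src.name   # assumes unique file names
        if not cached.exists():
            shutil.copy(src, cached)         # first epoch pays the slow read once
        image = Image.open(cached).convert("RGB")
        if self.transform is not None:
            image = self.transform(image)
        return image                         # label handling omitted for brevity
```

If the full 400 GB cannot be cached, packing many small files into large sequential shards (e.g., tar archives) also helps, since external drives are especially slow at random access.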

submitted by /u/bkffadia
[link] [comments]

[D] Data cleaning for classification model


Currently working on a classification model, which entails data cleaning. We've got 8000 images categorized into 3 classes. After removing duplicates and corrupted images, what else should we consider?
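A few more things worth checking are class balance, near-duplicates that survive exact-duplicate removal (resized or re-encoded copies), resolution outliers, and label errors. A rough sketch of the first three, assuming a data/<class_name>/*.jpg layout and the third-party imagehash package:

```python
from collections import Counter, defaultdict
from pathlib import Path

import imagehash                # pip install imagehash
from PIL import Image

data_dir = Path("data")         # assumed layout: data/<class_name>/<image>.jpg

# 1) Class balance: heavy imbalance may call for re-weighting or augmentation.
counts = Counter(p.parent.name for p in data_dir.glob("*/*.jpg"))
print(counts)

# 2) Near-duplicates via perceptual hashing (exact-duplicate removal misses
#    resized or re-encoded copies).
buckets = defaultdict(list)
for p in data_dir.glob("*/*.jpg"):
    buckets[str(imagehash.phash(Image.open(p)))].append(p)
near_dupes = {h: ps for h, ps in buckets.items() if len(ps) > 1}
print(f"{len(near_dupes)} groups of near-duplicate images")

# 3) Resolution outliers: extreme sizes often indicate scraping junk.
for p in data_dir.glob("*/*.jpg"):
    w, h = Image.open(p).size
    if min(w, h) < 64:
        print("suspiciously small:", p)
```

For label errors, training a quick baseline and manually reviewing its most confidently wrong predictions tends to surface mislabeled images quickly.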

submitted by /u/fardin__khan
[link] [comments]

[R] Paper (NAACL 2024): why LLMs cannot be used for everyday fact checking, on the reversal problem, on the solution to the reversal problem, and a lot more


You can find the paper here: https://arxiv.org/abs/2403.18671

Here is the list of things that you can find in the paper:

- We reveal that large commercial language models cannot be used for everyday fact-checking tasks.

- We argue that evaluating the fact checking pipeline across websites does not fully demonstrate model transferability, and instead, propose a straightforward way to repurpose existing datasets for the task.

- We empirically show that a fact-checking pipeline trained on an out-of-domain genre of claims is not as competitive as one trained on an in-domain genre of claims.

- We propose a novel adversarial method for the claim retriever.

- We report that language models (including the large models), are unable to infer the premise, given a hypothesis, even if they are trained on the premise to predict the correctness of the hypothesis (if it holds).

- We use the finding above to propose a straightforward augmentation method to enhance the performance of claim reader in the fact checking pipeline.

Fun fact about our paper: our paper and another paper were submitted to ICLR 2024 at the same time, and both reported the reversal problem in LLMs. But our paper also proposed a solution to the problem. Furthermore, we did all of this in only one section of our paper, and we offered a lot more in the other sections.

But what was the outcome!?? Our paper was about to get rejected (we withdrew it to avoid that), while the other paper was easily accepted :))))

#broken_system

submitted by /u/payam_ka
[link] [comments]

[D] Seeking guidance/advice


Hi, I've finished Andrew Ng's course on Coursera. I think I've got the basics.

I've started learning ML for my master's thesis. I want to develop a method to estimate scope 3 emissions. I studied business and I do not have any python background except for a 6-month data analytics bootcamp.

I've got the data needed for my thesis, but when I try to work on it, I'm not sure what I'm doing, and ofc a sh*t ton of bugs and errors. Do I need to just keep pushing through and learn through experience by working on my thesis, or do I need to study more first? I've been considering buying the book "Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow" by Aurélien Géron.

Any guidance/recommendation would be much appreciated!

submitted by /u/qheeeee
[link] [comments]

[D] State of the art TTS



Hey! I'm currently working on a project and I'd like to implement speech using TTS. I've tried many things and can't seem to find something that fits my needs. I haven't worked on TTS for a while, so I was wondering if there are newer technologies I could use.

Here is what I'm looking for :

I need it to be quite fast and without too many sound artifacts (I tried Bark, and while the possibility of manipulating emotion is quite remarkable, the generated voice is full of artifacts and noise).

It'd be a bonus if I could stream the audio and pipe it through other things, I'd like to apply an RVC Model on top of it (live)

Another 'nice to have' is to have some controls over the emotions or tone of the voice.

I tried these so far (either myself or through demos) :

Tortoise TTS and Edge TTS seem to have nice voice quality but are relatively monotone.

Bark, as I said, is very good at emotions and controls, but there are lots of artifacts in the voice; if I have time I'll try some post-processing, but I don't know to what extent it can help.

OpenAI's models don't have much emotion IMO, same as ElevenLabs.

I used Uberduck in the past, but it seems a lot of the fun functionality has disappeared.

If you have any advice or suggestions, or think there's something I should try further, feel free to reply!

I also want to thank everyone in advance!

Have a nice day!

submitted by /u/Zireaone
[link] [comments]

[P] Run AI & ML workflows locally from your Mac desktop


Hi all - I wanted to share an app I’ve been working on with a small team over the past year that I thought this community would be interested in. Odyssey is a completely native Mac app for creating remarkable art, getting work done, and automating repetitive tasks with the power of AI and machine learning models.
We just made a major feature update and added the ability to create your own Widgets. Odyssey Widgets are fully interactive mini applications that live in their own windows or panels and are driven by a workflow. This means you can take a workflow you create with Odyssey and add it directly to your desktop. So, as an example, you could generate an image, chat with a locally run chatbot, run bulk image processing, etc. straight from your desktop without even opening the Odyssey app. Widgets can be built with Odyssey and triggered from the Odyssey logo in your Mac's menu.

https://i.redd.it/8s9s6i0clvqc1.gif

We're in public beta but here's a full list of everything Odyssey supports:
Image generation and processing

  • Run Stable Diffusion 1.5, SDXL, SDXL Lightning, and SDXL Turbo locally or connect your Stable Diffusion API key
  • Add custom models & LoRAs
  • ControlNet support including canny edges, pose detection, depth estimation, and QR Code Monster
  • Inpainting and outpainting
  • Super resolution models (Best Buddy GAN, Ultrasharp 4x, Remacri, and ESRGAN)
  • Multiple image segmentation models
  • Erase objects
  • Dozens of image processing nodes including aspect ratio, resizing, and extracting dominant colors
  • Custom image transitions for powerful slideshows

Large language models and math equations

  • Run Llama2 locally or connect your ChatGPT API key
  • Supports both chatbot mode and instructions mode
  • Solver node for word problems and math nodes for complex equations
  • Lots of updates coming here in the next few weeks

Automation and batch workflows

  • Batch image and text nodes support hundreds of images and lines of text at once
  • Remove backgrounds, upscale, change aspect ratios, and run dozens of image processors in bulk

Private, customizable, and shareable

  • No images, chats, or inputs are stored or accessible by the Odyssey team
  • Completely private and secure. The only tracking is anonymized usage data to help us improve Odyssey
  • Process your own data entirely locally
  • No internet connection required to run local models
  • Use your own API keys for ChatGPT and Stable Diffusion
  • Easily save and share custom workflows

What’s coming soon:

  • Custom LLMs & more text processing nodes - we are adding support for bringing in custom LLMs, document uploads, and more

  • Batch text and workflow automation - we are building in document upload, batch text support, and an integration with Apple shortcuts

  • Plug-in support - we are opening up Odyssey to third-party developers. If you're interested, please reach out - we would love to learn more from you as we work on building this out

Feel free to reach out to [john@odysseyapp.io](mailto:john@odysseyapp.io) if you have any questions or feedback.

submitted by /u/creatorai
[link] [comments]

[D] Seeking Advice: Transitioning to Low-Level Implementations in AIoT Systems - Where to Start?


Hello everyone, I'm a prospective graduate student who will be starting my studies in September this year, specializing in AIoT (Artificial Intelligence of Things) Systems.

Recently, I've been reading papers from venues like INFOCOM and SIGCOMM, and I've noticed that they mostly focus on relatively low-level aspects of operating systems, including GPU/CPU scheduling, optimization of deep learning model inference, operator optimization, cross-platform migration, and deployment. I find it challenging to grasp the implementation details of these works at the code level; when I looked at the implementations uploaded on GitHub, I found them relatively difficult to understand.

My primary programming languages are Java and Python. During my undergraduate studies, I gained proficiency in implementing engineering projects and ideas using Python, especially in the fields of deep learning and machine learning. However, I lack experience and familiarity with C/C++ (many of the aforementioned works are based on C/C++).

Therefore, I would like to ask for advice from senior professionals and friends on which areas of knowledge I should focus on. Do I need to learn CUDA programming, operating system programming, or other directions? Any recommended learning paths would be greatly appreciated.

PS: Recently, I have started studying the MIT 6.S081 Operating System Engineering course.

Thank you all sincerely for your advice.

submitted by /u/MaTwickenham
[link] [comments]

[D] How do you measure performance of AI copilot/assistant?

[P] Insta Face Swap


[P] Visualize RAG Data


Hey all, I've recently published a tutorial at Towards Data Science that explores a somewhat overlooked aspect of Retrieval-Augmented Generation (RAG) systems: the visualization of documents and questions in the embedding space: https://towardsdatascience.com/visualize-your-rag-data-evaluate-your-retrieval-augmented-generation-system-with-ragas-fc2486308557

While much of the focus in RAG discussions tends to be on the algorithms and data processing, I believe that visualization can help to explore the data and to gain insights into problematic subgroups within the data.

This might be interesting for some of you, although I'm aware that not everyone is keen on this kind of visualization. I believe it can add a unique dimension to understanding RAG systems.
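For readers who want a quick taste before the full tutorial, here is one simple way to get such a picture (not necessarily the approach used in the article): embed documents and questions with the same encoder, project to 2D with UMAP, and plot the two groups.

```python
import matplotlib.pyplot as plt
import umap                                   # pip install umap-learn
from sentence_transformers import SentenceTransformer

# Placeholder texts; in practice you would use your real chunks and questions.
docs = ["chunk about topic A", "chunk about topic B", "chunk about topic C",
        "chunk about topic D", "chunk about topic E", "chunk about topic F"]
questions = ["what does the corpus say about topic A?", "is topic D covered?"]

model = SentenceTransformer("all-MiniLM-L6-v2")
emb = model.encode(docs + questions)

# Project the shared embedding space to 2D for plotting.
# n_neighbors=2 only because this toy example has very few points;
# keep the default for a real corpus with many chunks.
coords = umap.UMAP(n_components=2, n_neighbors=2, random_state=42).fit_transform(emb)

plt.scatter(coords[: len(docs), 0], coords[: len(docs), 1], label="documents", s=10)
plt.scatter(coords[len(docs):, 0], coords[len(docs):, 1], label="questions", marker="x")
plt.legend()
plt.title("Documents vs. questions in embedding space")
plt.show()
```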

submitted by /u/DocBrownMS
[link] [comments]

[D] Help finding an AI website


There's a website posted here in r/ML that compiles the best products recommended by each subreddit. For example, for earphones, it lists and ranks the top models and brands based on Redditors' reviews. I can't find the website for the life of me.

submitted by /u/vertigondriac
[link] [comments]

[P] Hybrid-Net: Real-time audio source separation, generate lyrics, chords, beat.


Project: https://github.com/DoMusic/Hybrid-Net

A transformer-based hybrid multimodal model: various transformer models address different problems in the field of music information retrieval, and the information they generate is interdependent, with each model's output influencing the others.

An AI-powered multimodal project focused on music that generates chords, beats, lyrics, melody, and tabs for any song.

submitted by /u/CheekProfessional146
[link] [comments]

[P] deit3-jax: A codebase for training ViTs on TPUs


Hey all, I have written a codebase to train ViTs following the DeiT and DeiT-III recipes. As they are strong baselines for training vanilla ViTs, reproducing them is necessary groundwork for research on ViT variants. However, the original repository is implemented in PyTorch, so it cannot be run on TPUs.

Therefore I re-implemented the simple ViT training codebase with DeiT and DeiT-III training recipes. Here is my repository: https://github.com/affjljoo3581/deit3-jax. I used Jax/Flax and webdataset to build a TPU-friendly training environment. Below are the reproduction results:

DeiT Reproduction

| Name | Data | Resolution | Epochs | Time | Reimpl. | Original | Config | Wandb | Model |
|------|------|------------|--------|------|---------|----------|--------|-------|-------|
| T/16 | in1k | 224 | 300 | 2h 40m | 73.1% | 72.2% | config | log | ckpt |
| S/16 | in1k | 224 | 300 | 2h 43m | 79.68% | 79.8% | config | log | ckpt |
| B/16 | in1k | 224 | 300 | 4h 40m | 81.46% | 81.8% | config | log | ckpt |

DeiT-III on ImageNet-1k

| Name | Data | Resolution | Epochs | Time | Reimpl. | Original | Config | Wandb | Model |
|------|------|------------|--------|------|---------|----------|--------|-------|-------|
| S/16 | in1k | 224 | 400 | 2h 38m | 80.7% | 80.4% | config | log | ckpt |
| S/16 | in1k | 224 | 800 | 5h 19m | 81.44% | 81.4% | config | log | ckpt |
| B/16 | in1k | 192 → 224 | 400 | 4h 42m | 83.6% | 83.5% | pt / ft | pt / ft | pt / ft |
| B/16 | in1k | 192 → 224 | 800 | 9h 28m | 83.91% | 83.8% | pt / ft | pt / ft | pt / ft |
| L/16 | in1k | 192 → 224 | 400 | 14h 10m | 84.62% | 84.5% | pt / ft | pt / ft | pt / ft |
| L/16 | in1k | 192 → 224 | 800 | - | - | 84.9% | pt / ft | - | - |
| H/14 | in1k | 154 → 224 | 400 | 19h 10m | 85.12% | 85.1% | pt / ft | pt / ft | pt / ft |
| H/14 | in1k | 154 → 224 | 800 | - | - | 85.2% | pt / ft | - | - |

DeiT-III on ImageNet-21k

| Name | Data | Resolution | Epochs | Time | Reimpl. | Original | Config | Wandb | Model |
|------|------|------------|--------|------|---------|----------|--------|-------|-------|
| S/16 | in21k | 224 | 90 | 7h 30m | 83.04% | 82.6% | pt / ft | pt / ft | pt / ft |
| S/16 | in21k | 224 | 240 | 20h 6m | 83.39% | 83.1% | pt / ft | pt / ft | pt / ft |
| B/16 | in21k | 224 | 90 | 12h 12m | 85.35% | 85.2% | pt / ft | pt / ft | pt / ft |
| B/16 | in21k | 224 | 240 | 33h 9m | 85.68% | 85.7% | pt / ft | pt / ft | pt / ft |
| L/16 | in21k | 224 | 90 | 37h 13m | 86.83% | 86.8% | pt / ft | pt / ft | pt / ft |
| L/16 | in21k | 224 | 240 | - | - | 87% | pt / ft | - | - |
| H/14 | in21k | 126 → 224 | 90 | 35h 51m | 86.78% | 87.2% | pt / ft | pt / ft | pt / ft |
| H/14 | in21k | 126 → 224 | 240 | - | - | - | pt / ft | - | - |

I trained all models on a TPU v4-64 Pod slice provided by the TRC program. I uploaded the checkpoints to the Hugging Face Hub, and you can also see the training logs on wandb. For more details, please check out my repository.
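For anyone curious what a webdataset-based TPU input pipeline roughly looks like, here is a hedged sketch (the shard pattern, field keys, and sizes are assumptions on my part, not code from deit3-jax):

```python
import numpy as np
import webdataset as wds      # pip install webdataset

# Shard naming and per-sample keys ("jpg", "cls") depend on how the tars were written.
shards = "imagenet-train-{000000..000146}.tar"

def to_numpy(img):
    # JAX consumes plain numpy arrays; crude resize so samples batch cleanly.
    img = img.resize((224, 224))
    return np.asarray(img, dtype=np.float32) / 255.0

dataset = (
    wds.WebDataset(shards)
    .shuffle(1000)                 # shuffle within a buffer
    .decode("pil")                 # decode images to PIL
    .to_tuple("jpg", "cls")        # pick the image and label fields
    .map_tuple(to_numpy, lambda y: y)
    .batched(256)
)

loader = wds.WebLoader(dataset, batch_size=None, num_workers=8)

for images, labels in loader:
    # images: (256, 224, 224, 3) float32 batch, ready to hand to jax.device_put
    break
```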

submitted by /u/affjljoo3581
[link] [comments]

[D] What is the state-of-the-art for 1D signal cleanup?


I have the following problem. Imagine I have a 'supervised' dataset of 1D curves with inputs and outputs, where the input is a modulated noisy signal and the output is the cleaned desired signal. Is there a consensus in the machine learning community on how to tackle this simple problem? Have you ever worked on anything similar? What algorithm did you end up using? Example: https://imgur.com/JYgkXEe
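Since the dataset is paired noisy/clean curves, this is a standard supervised sequence-to-sequence regression problem; a small 1D convolutional network trained with MSE is a strong baseline before trying anything fancier (U-Net-style and transformer-based denoisers are common next steps). A minimal PyTorch sketch (architecture and sizes are illustrative):

```python
import torch
import torch.nn as nn


class ConvDenoiser1D(nn.Module):
    """Simple fully convolutional denoiser: noisy curve in, cleaned curve out."""

    def __init__(self, channels: int = 64, kernel_size: int = 9):
        super().__init__()
        pad = kernel_size // 2
        self.net = nn.Sequential(
            nn.Conv1d(1, channels, kernel_size, padding=pad), nn.ReLU(),
            nn.Conv1d(channels, channels, kernel_size, padding=pad), nn.ReLU(),
            nn.Conv1d(channels, channels, kernel_size, padding=pad), nn.ReLU(),
            nn.Conv1d(channels, 1, kernel_size, padding=pad),
        )

    def forward(self, x):
        # x: (batch, 1, length); predict the residual so the network only has
        # to model the noise, which often trains more stably.
        return x - self.net(x)


model = ConvDenoiser1D()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

# noisy, clean: (batch, 1, length) tensors from your supervised dataset (random here)
noisy, clean = torch.randn(8, 1, 1024), torch.randn(8, 1, 1024)
for _ in range(10):                       # placeholder training loop
    pred = model(noisy)
    loss = loss_fn(pred, clean)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```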

submitted by /u/XmintMusic
[link] [comments]

[D] Evaluating xG Models: Comparing Discrete Outcomes with Continuous Predictions


I've recently developed an xG (expected goals) model using event data, and I'm exploring the best methods for evaluating its accuracy, given the nature of football: goals are discrete (each shot is a binary outcome), but my model predicts a continuous probability in (0, 1).

I'm curious about the most appropriate statistical techniques or metrics for comparison, rather than just MSE/RMSE. How do you assess the accuracy of your xG models under these conditions? Any advice or references on this topic would be greatly appreciated.
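For probabilistic predictions of binary outcomes, the usual choices are proper scoring rules (log loss, Brier score), a calibration check, and ranking metrics like ROC-AUC. A small scikit-learn sketch with placeholder arrays:

```python
import numpy as np
from sklearn.calibration import calibration_curve
from sklearn.metrics import brier_score_loss, log_loss, roc_auc_score

# y_true: 1 if the shot was a goal, 0 otherwise; y_prob: the model's xG per shot
y_true = np.array([0, 0, 1, 0, 1, 0, 0, 1])
y_prob = np.array([0.05, 0.12, 0.64, 0.08, 0.33, 0.02, 0.21, 0.71])

print("Brier score:", brier_score_loss(y_true, y_prob))  # squared error on probabilities, a proper scoring rule
print("Log loss:   ", log_loss(y_true, y_prob))           # heavily penalises confident mistakes
print("ROC-AUC:    ", roc_auc_score(y_true, y_prob))      # how well the model ranks goals above non-goals

# Calibration: among shots given xG around 0.2, roughly 20% should be goals.
frac_goals, mean_pred = calibration_curve(y_true, y_prob, n_bins=5)
print(list(zip(mean_pred, frac_goals)))
```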

submitted by /u/tipoviento
[link] [comments]

[D] What are some of the big tech company sponsored ML research websites that you are aware of for constantly keeping up with the ML research and workings behind their products, like Apple Machine Learning Research (https://machinelearning.apple.com/) or Tesla's AI day videos?


It would be great if there were a bundle of such sources or if you have a go to place where you keep up to date with all the new research going on.

submitted by /u/pontiac_RN
[link] [comments]

[N] The 77 French legal codes are now available via Hugging Face's Datasets library with daily updates


This groundwork enables ecosystem players to consider deploying RAG solutions in real time without having to configure data retrieval systems.

Link to Louis Brulé-Naudet's Hugging Face profile

```python
import concurrent.futures
import logging

import datasets
from tqdm import tqdm


def dataset_loader(name: str, streaming: bool = True) -> datasets.Dataset:
    """Helper function to load a single dataset in parallel.

    Parameters
    ----------
    name : str
        Name of the dataset to be loaded.
    streaming : bool, optional
        Determines if datasets are streamed. Default is True.

    Returns
    -------
    dataset : datasets.Dataset
        Loaded dataset object, or None if an error occurs during loading.
    """
    try:
        return datasets.load_dataset(name, split="train", streaming=streaming)
    except Exception as exc:
        logging.error(f"Error loading dataset {name}: {exc}")
        return None


def load_datasets(req: list, streaming: bool = True) -> list:
    """Downloads the datasets named in `req` and returns a list of loaded datasets.

    Parameters
    ----------
    req : list
        A list containing the names of datasets to be downloaded.
    streaming : bool, optional
        Determines if datasets are streamed. Default is True.

    Returns
    -------
    datasets_list : list
        A list containing the loaded datasets requested in `req`.

    Examples
    --------
    >>> datasets_list = load_datasets(["dataset1", "dataset2"], streaming=False)
    """
    datasets_list = []
    with concurrent.futures.ThreadPoolExecutor() as executor:
        # Forward the streaming flag to each loader call.
        future_to_dataset = {
            executor.submit(dataset_loader, name, streaming): name for name in req
        }
        for future in tqdm(
            concurrent.futures.as_completed(future_to_dataset), total=len(req)
        ):
            name = future_to_dataset[future]
            try:
                dataset = future.result()
                if dataset:
                    datasets_list.append(dataset)
            except Exception as exc:
                logging.error(f"Error processing dataset {name}: {exc}")
    return datasets_list


req = [
    "louisbrulenaudet/code-artisanat", "louisbrulenaudet/code-action-sociale-familles",
    "louisbrulenaudet/code-assurances", "louisbrulenaudet/code-aviation-civile",
    "louisbrulenaudet/code-cinema-image-animee", "louisbrulenaudet/code-civil",
    "louisbrulenaudet/code-commande-publique", "louisbrulenaudet/code-commerce",
    "louisbrulenaudet/code-communes", "louisbrulenaudet/code-communes-nouvelle-caledonie",
    "louisbrulenaudet/code-consommation", "louisbrulenaudet/code-construction-habitation",
    "louisbrulenaudet/code-defense", "louisbrulenaudet/code-deontologie-architectes",
    "louisbrulenaudet/code-disciplinaire-penal-marine-marchande", "louisbrulenaudet/code-domaine-etat",
    "louisbrulenaudet/code-domaine-etat-collectivites-mayotte",
    "louisbrulenaudet/code-domaine-public-fluvial-navigation-interieure",
    "louisbrulenaudet/code-douanes", "louisbrulenaudet/code-douanes-mayotte",
    "louisbrulenaudet/code-education", "louisbrulenaudet/code-electoral",
    "louisbrulenaudet/code-energie", "louisbrulenaudet/code-entree-sejour-etrangers-droit-asile",
    "louisbrulenaudet/code-environnement", "louisbrulenaudet/code-expropriation-utilite-publique",
    "louisbrulenaudet/code-famille-aide-sociale", "louisbrulenaudet/code-forestier-nouveau",
    "louisbrulenaudet/code-fonction-publique", "louisbrulenaudet/code-propriete-personnes-publiques",
    "louisbrulenaudet/code-collectivites-territoriales", "louisbrulenaudet/code-impots",
    "louisbrulenaudet/code-impots-annexe-i", "louisbrulenaudet/code-impots-annexe-ii",
    "louisbrulenaudet/code-impots-annexe-iii", "louisbrulenaudet/code-impots-annexe-iv",
    "louisbrulenaudet/code-impositions-biens-services",
    "louisbrulenaudet/code-instruments-monetaires-medailles",
    "louisbrulenaudet/code-juridictions-financieres", "louisbrulenaudet/code-justice-administrative",
    "louisbrulenaudet/code-justice-militaire-nouveau", "louisbrulenaudet/code-justice-penale-mineurs",
    "louisbrulenaudet/code-legion-honneur-medaille-militaire-ordre-national-merite",
    "louisbrulenaudet/livre-procedures-fiscales", "louisbrulenaudet/code-minier",
    "louisbrulenaudet/code-minier-nouveau", "louisbrulenaudet/code-monetaire-financier",
    "louisbrulenaudet/code-mutualite", "louisbrulenaudet/code-organisation-judiciaire",
    "louisbrulenaudet/code-patrimoine", "louisbrulenaudet/code-penal",
    "louisbrulenaudet/code-penitentiaire", "louisbrulenaudet/code-pensions-civiles-militaires-retraite",
    "louisbrulenaudet/code-pensions-retraite-marins-francais-commerce-peche-plaisance",
    "louisbrulenaudet/code-pensions-militaires-invalidite-victimes-guerre",
    "louisbrulenaudet/code-ports-maritimes", "louisbrulenaudet/code-postes-communications-electroniques",
    "louisbrulenaudet/code-procedure-civile", "louisbrulenaudet/code-procedure-penale",
    "louisbrulenaudet/code-procedures-civiles-execution", "louisbrulenaudet/code-propriete-intellectuelle",
    "louisbrulenaudet/code-recherche", "louisbrulenaudet/code-relations-public-administration",
    "louisbrulenaudet/code-route", "louisbrulenaudet/code-rural-ancien",
    "louisbrulenaudet/code-rural-peche-maritime", "louisbrulenaudet/code-sante-publique",
    "louisbrulenaudet/code-securite-interieure", "louisbrulenaudet/code-securite-sociale",
    "louisbrulenaudet/code-service-national", "louisbrulenaudet/code-sport",
    "louisbrulenaudet/code-tourisme", "louisbrulenaudet/code-transports",
    "louisbrulenaudet/code-travail", "louisbrulenaudet/code-travail-maritime",
    "louisbrulenaudet/code-urbanisme", "louisbrulenaudet/code-voirie-routiere",
]

dataset = load_datasets(req=req, streaming=True)
```
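As a quick follow-up (not part of the original snippet): with streaming=True the datasets are consumed lazily, so you can peek at a record by iterating, e.g.:

```python
# Field names depend on the dataset; this just prints the first raw record
# of the first loaded code.
first_record = next(iter(dataset[0]))
print(first_record)
```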

submitted by /u/louisbrulenaudet
[link] [comments]

[D] Machine Learning On The Edge


Hi guys, I found this today in my drawer. I forgot I had it and have never used it. It made me wonder what the current state of ML on the edge is and what your predictions are for the near future. We usually see big advances and news about large models, but not much about on-device applications.

submitted by /u/TheLastMate
[link] [comments]

[N] Introducing DBRX: A New Standard for Open LLM

[D] Are data structures and leetcode needed for Machine Learning Researcher/Engineer jobs and interviews?




