TypeError: 'NoneType' object is not subscriptable. With trl==0.15.0 and later. #568

### 🐛 Describe the bug

After updating trl, I got `TypeError: 'NoneType' object is not subscriptable` when using Liger Kernel.
The error does not occur with `transformers.AutoModelForCausalLM`.

- trl==0.14.0 => Success
- trl==0.15.0 => Fail
- trl git => Fail
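
To check which side of that boundary an environment falls on, something like the following could be used. It is only a sketch based on the three data points above: the helper names are hypothetical, and the parser is deliberately simplistic (it reads the leading numeric components only, so a `0.15.0.dev0` build is treated like `0.15.0`).

```python
from importlib import metadata

# First release reported as failing in the list above.
FIRST_FAILING = (0, 15, 0)


def parse_release(version: str) -> tuple:
    """Return the leading numeric components, e.g. '0.15.0.dev0' -> (0, 15, 0)."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def is_reported_failing(version: str) -> bool:
    """True if the version is at or after the first release reported as failing."""
    return parse_release(version) >= FIRST_FAILING


def installed_trl_version():
    """Return the installed trl version string, or None if trl is not installed."""
    try:
        return metadata.version("trl")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_trl_version()
    if version is None:
        print("trl is not installed")
    else:
        print(f"trl {version}: reported failing = {is_reported_failing(version)}")
```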


Error:
```
Traceback (most recent call last):
  File "/workspaces/LLMTrain/t2.py", line 28, in <module>
    trainer.train()
  File "/opt/conda/lib/python3.11/site-packages/transformers/trainer.py", line 2241, in train
    return inner_training_loop(
           ^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/transformers/trainer.py", line 2548, in _inner_training_loop
    tr_loss_step = self.training_step(model, inputs, num_items_in_batch)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/transformers/trainer.py", line 3698, in training_step
    loss = self.compute_loss(model, inputs, num_items_in_batch=num_items_in_batch)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/trl/trainer/sft_trainer.py", line 444, in compute_loss
    shift_logits = outputs.logits[..., :-1, :].contiguous()
                   ~~~~~~~~~~~~~~^^^^^^^^^^^^^
TypeError: 'NoneType' object is not subscriptable
  0%|          | 0/4 [00:04<?, ?it/s]
```
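
The last frame indicates that `outputs.logits` was `None` when `compute_loss` tried to slice it. A plausible cause, which is an assumption and not confirmed in this report, is that the Liger-patched forward returns the loss without materializing logits. The sketch below reproduces the same `TypeError` with a stand-in object and shows the kind of `None` guard that would avoid it; `shifted_logits` is a hypothetical helper, not TRL code.

```python
from types import SimpleNamespace

# Stand-in for a model output whose logits were never materialized.
outputs = SimpleNamespace(loss=0.5, logits=None)


def subscript_error(outputs):
    """Slice the logits the way the traceback does and return the error message, if any."""
    try:
        outputs.logits[..., :-1, :]
    except TypeError as exc:
        return str(exc)
    return None


def shifted_logits(outputs):
    """Hypothetical guard: skip the logits-based step when logits are absent."""
    if outputs.logits is None:
        return None
    return outputs.logits[..., :-1, :]


print(subscript_error(outputs))
print(shifted_logits(outputs))
```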

### Reproduce

Minimal code to reproduce the error:
```python
from datasets import Dataset
from liger_kernel.transformers import AutoLigerKernelForCausalLM
from transformers import AutoTokenizer, AutoModelForCausalLM

from trl import SFTConfig, SFTTrainer

model_id = "trl-internal-testing/tiny-Qwen2ForCausalLM-2.5"

# model = AutoModelForCausalLM.from_pretrained(model_id)
model = AutoLigerKernelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token

dummy_dataset = Dataset.from_dict({"text": ["Dummy dataset"] * 16, })

training_args = SFTConfig(
    num_train_epochs=1,
    per_device_train_batch_size=4,
    report_to="none",
)
trainer = SFTTrainer(
    model=model,
    args=training_args,
    train_dataset=dummy_dataset,
    processing_class=tokenizer,
)

trainer.train()
```

### Versions

Environment Report:
-------------------
```
Operating System: Linux-6.8.0-52-generic-x86_64-with-glibc2.35
Python version: 3.11.10
Liger Kernel version: 0.5.3
PyTorch version: 2.5.1+cu124
CUDA version: 12.4
HIP(ROCm) version: Not available
Triton version: 3.1.0
Transformers version: 4.49.0.dev0
XPU version: XPU Not Available
```
